We investigate the robustness of nonlinear filtering for continuous
time finite state Markov chains, observed in white noise, with
respect to misspecification of the model parameters. It is shown that
the distance between the optimal filter and that with incorrect model
parameters converges to zero uniformly over the infinite time interval
as the misspecified model converges to the true model, provided
the signal obeys a mixing condition. The filtering error is controlled
through the exponential decay of the derivative of the nonlinear filter
with respect to its initial condition. We allow simultaneously for
misspecification of the initial condition, of the transition rates of the
signal, and of the observation function. The first two cases are treated
by relatively elementary means, while the latter case requires the use
of Skorokhod integrals and tools of anticipative stochastic calculus.

1. Introduction. The theory of nonlinear filtering concerns the estimation
of a signal corrupted by white noise, and has diverse applications in
target tracking, signal processing, automatic control, finance, and so on.
The basic setting of the theory involves a Markov signal process, for
example, the solution of a (nonlinear) stochastic differential equation or a
finite-state Markov process, observed in independent corrupting noise. The
calculation of the resulting filters is a classical topic in stochastic analysis
[14]. Of course, the filtering equations will depend explicitly on the model
chosen for the signal process and observations; in almost all realistic
applications, however, the model that underlies the filter is only an
approximation of the true system that generates the observations. In order
for the theory to be practically useful, it is important to establish that the
filtered estimates are not too sensitive to the choice of underlying model.
P. CHIGANSKY AND R. VAN HANDEL

Continuity with respect to the model parameters of nonlinear filtering
estimates on a fixed finite time interval is well established (see, for example,
[3, 4, 10]); generally speaking, it is known that the error incurred in a finite
time interval due to the choice of incorrect model parameters can be made 
arbitrarily small if the model parameters are chosen sufficiently close to those 
of the true model. As the corresponding error bounds grow rapidly with the 
length of the time interval, however, such estimates are of little use if we are 
interested in robustness of the filter over a long period of time. One would 
like to show that the approximation errors do not accumulate, so that the 
error remains bounded uniformly over an infinite time interval. 

The model robustness of nonlinear filters on the infinite time horizon was 
investigated in discrete time in [6, 12, 13, 19]. The key idea that allows one to 
control the accumulation of approximation errors is the asymptotic stability 
property of many nonlinear filters, which is the focus of much recent work 
(see [1, 2] and the references therein) and can be summarized as follows. The 
optimal nonlinear filter is a recursive equation that is initialized with the true 
distribution of the signal process at the initial time. If the filter is initialized 
with a different distribution, then the resulting filtered estimates are no 
longer optimal (in the least-squares sense). The filter is called asymptotically
stable if the solution of the wrongly initialized filter converges to the solution 
of the correctly initialized filter at large times; that is, the filter "forgets" 
its initial condition after a period of observation. 

Using an approximate filter rather than the optimal filter is equivalent to 
using the optimal filter where we make an approximation error after every 
time step. Now suppose the optimal filter forgets its initial condition at an 
exponential rate; then also the approximation error at each time step is for- 
gotten at an exponential rate, and the errors cannot accumulate in time. If 
the approximation error at each time step is bounded (finite time robust- 
ness), then the total approximation error will be bounded uniformly in time. 
Model robustness on the infinite time horizon is thus a consequence of finite 
time robustness together with the exponential forgetting property of the 
filter. This is precisely the method used in [6, 12, 13, 19], and its implemen- 
tation is fairly straightforward once bounds on the exponential forgetting 
rate of the filter have been obtained. However, the method used there does 
not extend to nonlinear filtering in continuous time; even the continuous 
time model with point process observations studied in [6], though more in- 
volved, reduces essentially to discrete (but random) observation times. The 
continuous time case requires different tools, which we develop in this paper 
in the setting of nonlinear filtering of a finite-state Markov signal process. 
(We also mention [7], where a different but related problem is solved.) 

We consider the following filtering setup. The signal process $X = (X_t)_{t\ge 0}$
is a continuous time, homogeneous Markov chain with values in the finite
alphabet $\mathbb{S} = \{a_1, \ldots, a_d\}$, transition intensities matrix
$\Lambda = (\lambda_{ij})$ and initial distribution $\nu^i = P(X_0 = a_i)$. The
observation process $Y = (Y_t)_{t\ge 0}$ is given by

(1.1) $Y_t = \int_0^t h(X_s)\,ds + B_t,$

where $h:\mathbb{S}\to\mathbb{R}$ is the observation function [we will also write
$h^i = h(a_i)$] and $B$ is a Wiener process that is independent of $X$. The
filtering problem for this model concerns the calculation of the conditional
probabilities $\pi_t^i = P(X_t = a_i \mid \mathcal{F}_t^Y)$ from the observations
$\{Y_s : s \le t\}$, where $\mathcal{F}_t^Y = \sigma\{Y_s : s \le t\}$.
It is well known that $\pi_t$ satisfies the Wonham equation [14, 22]

(1.2) $d\pi_t = \Lambda^*\pi_t\,dt + (H - h^*\pi_t)\pi_t\,(dY_t - h^*\pi_t\,dt), \qquad \pi_0 = \nu,$

where $x^*$ denotes the transpose of $x$ and $H = \mathrm{diag}\,h$. Note that the
Wonham equation is initialized with the true distribution of $X_0$; we will
denote by $\pi_t(\mu)$ the solution of the Wonham equation at time $t$ with an
arbitrary initial distribution $\pi_0 = \mu$, and by $\pi_{s,t}(\mu)$ the solution of
the Wonham equation at time $t \ge s$ with the initial condition $\pi_s = \mu$.
In [1, 2] the exponential forgetting property of the Wonham filter was
established as follows: the $\ell_1$-distance $|\pi_t(\mu) - \pi_t(\nu)|$ decays
exponentially a.s., provided the initial distributions are equivalent,
$\mu \sim \nu$, and the mixing condition $\lambda_{ij} > 0$ $\forall i \ne j$ is satisfied.
Now consider the Wonham filter with incorrect model parameters:

(1.3) $d\tilde\pi_t = \tilde\Lambda^*\tilde\pi_t\,dt + (\tilde H - \tilde h^*\tilde\pi_t)\tilde\pi_t\,(dY_t - \tilde h^*\tilde\pi_t\,dt), \qquad \tilde\pi_0 = \nu,$

where $\tilde\Lambda$ and $\tilde h$ denote a transition intensities matrix and
observation function that do not match the underlying signal-observation
model $(X, Y)$, $\tilde H = \mathrm{diag}\,\tilde h$, and we denote by $\tilde\pi_t(\mu)$
the solution of this equation with initial condition $\tilde\pi_0 = \mu$ and by
$\tilde\pi_{s,t}(\mu)$ the solution with $\tilde\pi_s = \mu$. The following is
the main result of this paper.

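Theorem 1.1. Suppose $\nu^i, \mu^i > 0$ $\forall i$ and $\lambda_{ij}, \tilde\lambda_{ij} > 0$ $\forall i \ne j$. Then

$\sup_{t\ge 0} E\|\tilde\pi_t(\mu) - \pi_t(\nu)\|^2 \le C_1|\mu - \nu| + C_2|\tilde h - h| + C_3|\tilde\Lambda^* - \Lambda^*|,$

where $|\tilde\Lambda^* - \Lambda^*| = \sup\{|(\tilde\Lambda^* - \Lambda^*)\tau| : \tau^i > 0\ \forall i,\ |\tau| = 1\}$
and the quantities $C_1, C_2, C_3$ are bounded on any compact subset of parameters
$\{(\nu, \Lambda, h, \mu, \tilde\Lambda, \tilde h) : \nu^i, \mu^i > 0\ \forall i,\ |\nu| = |\mu| = 1,\ \lambda_{ij}, \tilde\lambda_{ij} > 0\ \forall i \ne j,\ \sum_j \lambda_{ij} = \sum_j \tilde\lambda_{ij} = 0\ \forall i\}$.
Additionally we have the asymptotic estimate

$\limsup_{t\to\infty} E\|\tilde\pi_t(\mu) - \pi_t(\nu)\|^2 \le C_2|\tilde h - h| + C_3|\tilde\Lambda^* - \Lambda^*|.$

In particular, this implies that if $\nu^i > 0$ $\forall i$, $\lambda_{ij} > 0$ $\forall i \ne j$, then

$\lim_{(\tilde h, \tilde\Lambda, \mu)\to(h, \Lambda, \nu)} \sup_{t\ge 0} E\|\tilde\pi_t(\mu) - \pi_t(\nu)\| = \lim_{(\tilde h, \tilde\Lambda)\to(h, \Lambda)} \limsup_{t\to\infty} E\|\tilde\pi_t(\mu) - \pi_t(\nu)\| = 0.$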
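
The statement of Theorem 1.1 can be explored numerically. The following is only a rough sketch under stated assumptions, not the construction used in the paper: it takes a hypothetical two-state chain with unit jump rates and $h = (0, 1)$, discretizes time with a simple predict, correct, normalize scheme (a splitting approximation chosen here for convenience), and records the largest $\ell_1$-gap between the exact filter and a filter run with misspecified rate and observation function. The function names `step` and `sup_error` and all parameter values are illustrative.

```python
import math
import random

def step(pi, lam, h, dy, dt):
    """Predict with I + Lambda^* dt, correct with the likelihood, normalize."""
    d = len(pi)
    pred = [pi[i] + dt * sum(lam[j][i] * pi[j] for j in range(d)) for i in range(d)]
    w = [pred[i] * math.exp(h[i] * dy - 0.5 * h[i] ** 2 * dt) for i in range(d)]
    s = sum(w)
    return [x / s for x in w]

def sup_error(rate_tilde, h2_tilde, seed=1, t_end=20.0, dt=0.01):
    """Largest l1-gap over [0, t_end] between exact and misspecified filters."""
    rng = random.Random(seed)
    lam, h = [[-1.0, 1.0], [1.0, -1.0]], [0.0, 1.0]            # true model (assumed)
    lam_t = [[-rate_tilde, rate_tilde], [rate_tilde, -rate_tilde]]
    h_t = [0.0, h2_tilde]                                       # misspecified model
    state, pi, pi_t, worst = 0, [0.5, 0.5], [0.5, 0.5], 0.0
    for _ in range(int(t_end / dt)):
        if rng.random() < dt:                                   # true jump rate is 1
            state = 1 - state
        dy = h[state] * dt + math.sqrt(dt) * rng.gauss(0.0, 1.0)
        pi, pi_t = step(pi, lam, h, dy, dt), step(pi_t, lam_t, h_t, dy, dt)
        worst = max(worst, sum(abs(a - b) for a, b in zip(pi, pi_t)))
    return worst

# Print the observed worst-case gap for a few misspecification levels.
for eps in (0.4, 0.2, 0.1):
    print(eps, sup_error(1.0 + eps, 1.0 + eps))
```

A single simulated path only illustrates the pathwise gap; the theorem itself concerns the expectation, which would require averaging over many seeds.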
Let us sketch the basic idea of the proof. Rather than considering the
Wonham filter, let us demonstrate the idea using the following simple
caricature of a filtering equation. Consider a smooth "observation" $y_t$ and a
"filter" whose state $x_t$ is propagated by the ordinary differential equation
$dx_t/dt = f(x_t, y_t)$. Similarly, we consider the "approximate filter"
$d\tilde x_t/dt = \tilde f(\tilde x_t, y_t)$ and assume that everything is sufficiently
smooth, so that for fixed $y$ both equations generate a two-parameter flow
$x_t = \varphi_{s,t}(x_s)$, $\tilde x_t = \tilde\varphi_{s,t}(\tilde x_s)$.
The following calculation is straightforward:

$\tilde\varphi_{0,t}(x) - \varphi_{0,t}(x) = \int_0^t \frac{d}{ds}\,\varphi_{s,t}(\tilde\varphi_{0,s}(x))\,ds = \int_0^t D\varphi_{s,t}(\tilde\varphi_{0,s}(x)) \cdot \big(\tilde f(\tilde\varphi_{0,s}(x), y_s) - f(\tilde\varphi_{0,s}(x), y_s)\big)\,ds,$

where $D\varphi_{s,t}(x)\cdot v$ denotes the directional derivative of $\varphi_{s,t}(x)$
in the direction $v$. Hence we obtain the following estimate on the
approximation error:

$|\varphi_{0,t}(x) - \tilde\varphi_{0,t}(x)| \le \int_0^t |D\varphi_{s,t}(\tilde\varphi_{0,s}(x))|\,|f(\tilde\varphi_{0,s}(x), y_s) - \tilde f(\tilde\varphi_{0,s}(x), y_s)|\,ds.$

Now suppose that $|f(\cdot,\cdot) - \tilde f(\cdot,\cdot)| \le K$, where $K \to 0$ as
$\tilde f \to f$; this is an expression of finite-time robustness, as it ensures
that $|\varphi_{0,t}(x) - \tilde\varphi_{0,t}(x)| \le K_t \to 0$ (for fixed $t$) as
$\tilde f \to f$. Suppose furthermore that we can establish
a bound of the form $|D\varphi_{s,t}(\cdot)| \le Ce^{-\beta(t-s)}$, that is, an
infinitesimal perturbation to the initial condition is forgotten at an
exponential rate. Then the estimate above is uniformly bounded and converges
to zero uniformly in time as $\tilde f \to f$. Conceptually this is similar to the
logic used in discrete time, but we have to replace the exponential forgetting
of the initial condition by the requirement that the derivative of the filter
with respect to its initial condition decays exponentially.
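
The caricature can be tried out on a concrete toy example. The sketch below is an illustration under assumptions that are not in the paper: the drift is taken to be linear, $f(x, y) = -ax + y$, the approximate drift uses a perturbed rate $\tilde a$, the "observation" is $y_t = \sin t$, and the flows are computed with an explicit Euler scheme. Here the derivative of the exact flow decays like $e^{-a(t-s)}$, so the error should stay bounded on the whole time interval.

```python
import math

def simulate(a, a_tilde, t_end=20.0, dt=1e-3):
    """Euler scheme for dx/dt = -a*x + y_t and its misspecified copy.

    Returns the largest gap |x_t - x_tilde_t| seen on [0, t_end].
    """
    x = x_tilde = 1.0              # both flows start from the same point
    sup_err = 0.0
    for n in range(int(t_end / dt)):
        y = math.sin(n * dt)       # a smooth, bounded "observation" path
        x += dt * (-a * x + y)
        x_tilde += dt * (-a_tilde * x_tilde + y)
        sup_err = max(sup_err, abs(x - x_tilde))
    return sup_err

# Print the observed worst-case gap for a few misspecification levels.
for eps in (0.2, 0.1, 0.05):
    print(eps, simulate(1.0, 1.0 + eps))
```

In this linear example the gap is bounded by $|\tilde a - a|/a$ at every time, which is the uniform-in-time behaviour described above; lengthening `t_end` does not make the bound worse.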

Returning to the Wonham filter, this procedure can be implemented in a 
fairly straightforward way if $\tilde h = h$. In this case, most of the work involves
finding a suitable estimate on the exponential decay of the derivative of the 
filter with respect to its initial condition; despite the large number of results 
on filter stability, such estimates are not available in the literature to date. 
We obtain estimates by adapting methods from [2], together with uniform 
estimates of the concentration of the optimal filter near the boundary of the 
simplex. 

The general case with $\tilde h \ne h$ is significantly more involved. The problem is
already visible in the simple demonstration above. Note that the integrand 
on the right-hand side of the error estimate is not adapted; it depends on 
the observations on the entire interval [0, t] . As the Wonham filter is defined 
in terms of an Ito-type stochastic integral, this will certainly get us into 
trouble. When $\tilde h = h$ the stochastic integral cancels in the error bound and
the problems are kept to a minimum; in the general case, however, we have
no such luck. Nonetheless this problem is not prohibitive, but it requires
us to use the stochastic calculus for anticipating integrands developed by 
Nualart and Pardoux [15, 16] using Skorokhod integrals rather than Ito 
integrals and using Malliavin calculus tools. 

An entirely different application of the Malliavin calculus to problems of 
filter stability can be found in [8]. 

The remainder of this paper is organized as follows. In Section 2 we prove 
some regularity properties of the solution of the Wonham equation. We also 
demonstrate the error estimate discussed above in the simpler case $\tilde h = h$,
and comment on the more general applicability of such a bound. In Section
3 we obtain exponential bounds on the derivative of the Wonham filter with
respect to its initial condition. Section 4 treats the general case $\tilde h \ne h$ using
anticipative stochastic calculus; some of the technical estimates appear in 
Appendix B. Finally, Appendix A contains a brief review of the results from 
the Malliavin calculus and anticipative stochastic calculus that are needed 
in the proofs. 

Notation. The signal-observation pair $(X, Y)$ is defined on the standard
probability space $(\Omega, \mathcal{F}, P)$. The expectation with respect to $P$ is
denoted by $E$ or sometimes $E_P$. For $x \in \mathbb{R}^d$, we denote by $|x|$ the
$\ell_1$-norm, by $\|x\|$ the $\ell_2$-norm, and by $\|x\|_p$ the $\ell_p$-norm. We write
$x \succ y$ (resp. $\prec, \succeq, \preceq$) if $x^i > y^i$ ($<, \ge, \le$) $\forall i$.

The following spaces will be used throughout. Probability distributions
on $\mathbb{S}$ are elements of the simplex $\Delta^{d-1} = \{x \in \mathbb{R}^d : x \succeq 0,\ |x| = 1\}$.
Usually, we will be interested in the interior of the simplex
$\mathcal{S}^{d-1} = \{x \in \mathbb{R}^d : x \succ 0,\ |x| = 1\}$. The space of vectors tangent
to $\mathcal{S}^{d-1}$ is denoted by $T\mathcal{S}^{d-1} = \{x \in \mathbb{R}^d : \sum_i x^i = 0\}$.
Finally, we will denote the positive orthant by
$\mathbb{R}^d_{++} = \{x \in \mathbb{R}^d : x \succ 0\}$.

2. Preliminaries. Equation (1.2) is a nonlinear equation for the conditional
distribution $\pi_t$. It is well known, however (e.g., [9]), that $\pi_t$ can also be
calculated in a linear fashion: $\pi_t = \rho_t/|\rho_t|$, where the unnormalized density
$\rho_t$ is propagated by the Zakai equation

(2.1) $d\rho_t = \Lambda^*\rho_t\,dt + H\rho_t\,dY_t, \qquad \rho_0 = \nu.$

We will repeatedly exploit this representation in what follows. As before
$\rho_t(\mu)$ and $\rho_{s,t}(\mu)$ ($t \ge s$) denote the solution of the Zakai equation at time
$t$ with the initial condition $\rho_0 = \mu$ and $\rho_s = \mu$, respectively, and
$\pi_{s,t}(\mu) = \rho_{s,t}(\mu)/|\rho_{s,t}(\mu)|$.
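
The linear-then-normalize representation also suggests a simple way to experiment with the filter on a computer. The sketch below is a hedged illustration, not the authors' method: it assumes a hypothetical two-state chain, replaces the Zakai equation by a first-order splitting scheme (prediction with $I + \Lambda^* dt$, then multiplication by the likelihood factor $\exp(h^i\,dY - \tfrac12 (h^i)^2 dt)$), and normalizes after every step. Two copies of the filter are started from different initial distributions and driven by the same observations, to illustrate the forgetting property recalled in the Introduction.

```python
import math
import random

def filter_step(pi, lam, h, dy, dt):
    """One step of a splitting scheme for the Zakai equation, then normalize."""
    d = len(pi)
    # Prediction: rho <- (I + Lambda^* dt) rho, with Lambda^*[i][j] = lam[j][i].
    pred = [pi[i] + dt * sum(lam[j][i] * pi[j] for j in range(d)) for i in range(d)]
    # Correction: multiply by the likelihood factor exp(h_i dY - h_i^2 dt / 2).
    rho = [pred[i] * math.exp(h[i] * dy - 0.5 * h[i] ** 2 * dt) for i in range(d)]
    total = sum(rho)                       # |rho|: the l1-norm of a positive vector
    return [r / total for r in rho]        # pi = rho / |rho|

def run(seed=0, t_end=10.0, dt=0.01):
    rng = random.Random(seed)
    lam = [[-1.0, 1.0], [1.0, -1.0]]       # assumed intensities (rows sum to zero)
    h = [0.0, 1.0]                         # assumed observation function
    state = 0
    pi_a, pi_b = [0.9, 0.1], [0.1, 0.9]    # two different initial distributions
    for _ in range(int(t_end / dt)):
        if rng.random() < -lam[state][state] * dt:   # simulate a jump of the chain
            state = 1 - state
        dy = h[state] * dt + math.sqrt(dt) * rng.gauss(0.0, 1.0)
        pi_a = filter_step(pi_a, lam, h, dy, dt)
        pi_b = filter_step(pi_b, lam, h, dy, dt)
    return pi_a, pi_b

pi_a, pi_b = run()
print(sum(abs(x - y) for x, y in zip(pi_a, pi_b)))   # l1-distance at time t_end
```

Normalizing at every step is harmless because the Zakai dynamics are linear, and it keeps the numbers in a safe floating-point range over long horizons.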
We also recall the following interpretation of the norm $|\rho_t|$ of the
unnormalized conditional distribution. If we define a new measure $Q \sim P$ through

(2.2) $\frac{dP}{dQ} = |\rho_t(\nu)| = |\rho_t|,$

then under $Q$ the observation process $Y_t$ is an $\mathcal{F}_t^Y$-Wiener process. This
observation will be used in Section 4 to apply the Malliavin calculus.

The main goal of this section is to establish some regularity properties
of the solutions of the Wonham and Zakai equations. In particular, as we
will want to calculate the derivative of the filter with respect to its initial
condition, we have to establish that $\pi_{s,t}(\mu)$ is in fact differentiable. We will
avoid problems at the boundary of the simplex by disposing of it altogether:
we begin by proving that if $\mu \in \mathcal{S}^{d-1}$, then a.s. $\pi_{s,t}(\mu) \in \mathcal{S}^{d-1}$
for all times $t > s$.

Lemma 2.1. $P(\rho_{s,t}(\mu) \in \mathbb{R}^d_{++}$ for all $\mu \in \mathbb{R}^d_{++}$, $0 \le s \le t < \infty) = 1$.

Proof. The following variant on the pathwise filtering method reduces
the Zakai equation to a random differential equation. First, we write
$\Lambda^* = S + T$ where $S$ is the diagonal matrix with $S_{ii} = \lambda_{ii}$. Note that the
matrix $T$ has only nonnegative entries. We now perform the transformation
$f_{s,t}(\mu) = L_{s,t}\,\rho_{s,t}(\mu)$ where

$L_{s,t} = \exp\big((\tfrac12 H^2 - S)(t - s) - H(Y_t - Y_s)\big).$

Then $f_{s,t}(\mu)$ satisfies

(2.3) $\frac{df_{s,t}}{dt} = L_{s,t}\,T\,L_{s,t}^{-1} f_{s,t}, \qquad f_{s,s} = \mu.$

Let $\Omega_c \subset \Omega$, $P(\Omega_c) = 1$ be a set such that $t \mapsto B_t(\omega)$ is continuous
for every $\omega \in \Omega_c$. Then $t \mapsto L_{s,t}$, $t \mapsto L_{s,t}^{-1}$ are continuous in $t$ and
have strictly positive diagonal elements for every $\omega \in \Omega_c$. By standard
arguments, there exists for every $\omega \in \Omega_c$, $\mu \in \mathbb{R}^d$ and $s \ge 0$ a unique
solution $f_{s,t}(\mu)$ to equation (2.3) where $t \mapsto f_{s,t}(\mu)$ is a $C^1$-curve.
Moreover, note that $L_{s,t}TL_{s,t}^{-1}$ has nonnegative matrix elements for every
$\omega \in \Omega_c$, $s \le t < \infty$. Hence if $\mu \in \mathbb{R}^d_{++}$ then clearly $f_{s,t}(\mu)$ must be
nondecreasing, that is, $f_{s,t} \succeq f_{s,r}$ for every $t \ge r \ge s$ and $\omega \in \Omega_c$.
But then $\mathbb{R}^d_{++}$ must be forward invariant under equation (2.3) for every
$\omega \in \Omega_c$, and as $L_{s,t}$ has strictly positive diagonal elements the
result follows. □

Corollary 2.2. $P(\pi_{s,t}(\mu) \in \mathcal{S}^{d-1}$ for all $\mu \in \mathcal{S}^{d-1}$, $0 \le s \le t < \infty) = 1$.
Let us now investigate the map $\rho_{s,t}(\mu)$. As this map is linear in $\mu$, we can
write $\rho_{s,t}(\mu) = U_{s,t}\mu$ a.s., where the $d \times d$ matrix $U_{s,t}$ is the solution of

(2.4) $dU_{s,t} = \Lambda^* U_{s,t}\,dt + HU_{s,t}\,dY_t, \qquad U_{s,s} = I.$

The following lemma establishes that $U_{s,t}$ defines a linear stochastic flow in
$\mathbb{R}^d$.

Lemma 2.3. For a.e. $\omega \in \Omega$: (i) $\rho_{s,t}(\mu) = U_{s,t}\mu$ for all $s \le t$; (ii) $U_{s,t}$ is
continuous in $(s,t)$; (iii) $U_{s,t}$ is invertible for all $s \le t$, where $U_{s,t}^{-1}$ is given
by

(2.5) $dU_{s,t}^{-1} = -U_{s,t}^{-1}\Lambda^*\,dt + U_{s,t}^{-1}H^2\,dt - U_{s,t}^{-1}H\,dY_t, \qquad U_{s,s}^{-1} = I;$

(iv) $U_{r,t}U_{s,r} = U_{s,t}$ (and hence $U_{s,t}U_{s,r}^{-1} = U_{r,t}$) for all $s \le r \le t$.

Proof. Continuity of $U_{s,t}$ (and $U_{s,t}^{-1}$) is a standard property of solutions
of Lipschitz stochastic differential equations. Invertibility of $U_{0,t}$ for all
$0 \le t < \infty$ is established in [20], page 326, and it is evident that
$U_{s,t} = U_{0,t}U_{0,s}^{-1}$ satisfies equation (2.4). The remaining statements follow,
where we can use continuity to remove the time dependence of the exceptional
set as in the proof of [20], page 326. □

We now turn to the properties of the map $\pi_{s,t}(\mu)$.

Lemma 2.4. The Wonham filter generates a smooth stochastic semiflow
in $\mathcal{S}^{d-1}$, that is, the solutions $\pi_{s,t}(\mu)$ satisfy the following conditions:

1. For a.e. $\omega \in \Omega$, $\pi_{s,t}(\mu) = \pi_{r,t}(\pi_{s,r}(\mu))$ for all $s \le r \le t$ and $\mu$.

2. For a.e. $\omega \in \Omega$, $\pi_{s,t}(\mu)$ is continuous in $(s,t,\mu)$.

3. For a.e. $\omega \in \Omega$, the injective map $\pi_{s,t}(\cdot):\mathcal{S}^{d-1}\to\mathcal{S}^{d-1}$ is $C^\infty$ for all $s \le t$.

Proof. For $x \in \mathbb{R}^d_{++}$ define $\Sigma(x) = x/|x|$, so that
$\pi_{s,t}(\mu) = \Sigma(\rho_{s,t}(\mu))$ ($\mu \in \mathcal{S}^{d-1}$). Note that $\Sigma$ is smooth on
$\mathbb{R}^d_{++}$. Hence continuity in $(s,t,\mu)$ and smoothness with respect to $\mu$ follow
directly from the corresponding properties of $\rho_{s,t}(\mu)$. The semiflow property
$\pi_{s,t}(\mu) = \pi_{r,t}(\pi_{s,r}(\mu))$ follows directly from Lemma 2.3. It remains to prove
injectivity.

Suppose that $\pi_{s,t}(\mu) = \pi_{s,t}(\nu)$ for some $\mu, \nu \in \mathcal{S}^{d-1}$. Then
$U_{s,t}\mu/|U_{s,t}\mu| = U_{s,t}\nu/|U_{s,t}\nu|$, and as $U_{s,t}$ is invertible we have
$\mu = (|U_{s,t}\mu|/|U_{s,t}\nu|)\,\nu$. But as $\mu$ and $\nu$ must lie in $\mathcal{S}^{d-1}$, it
follows that $\mu = \nu$. Hence $\pi_{s,t}(\cdot)$ is injective. □

Remark 2.5. The results in this section hold identically if we replace
$\Lambda$ by $\tilde\Lambda$, $h$ by $\tilde h$. We will use the obvious notation
$\tilde\pi_{s,t}(\mu)$, $\tilde\rho_{s,t}(\mu)$, $\tilde U_{s,t}$, and so on.
We finish this section by obtaining an expression for the approximation
error in the case $\tilde h = h$; in fact, we will demonstrate the bound for this simple
case in a more general setting than is considered in the following. Rather
than considering the approximate Wonham filter with modified $\Lambda$, consider
the equation

(2.6) $d\tilde\pi_t = f(\tilde\pi_t)\,dt + (H - h^*\tilde\pi_t)\tilde\pi_t\,(dY_t - h^*\tilde\pi_t\,dt), \qquad \tilde\pi_0 = \mu \in \mathcal{S}^{d-1},$

where $f:\mathcal{S}^{d-1}\to T\mathcal{S}^{d-1}$ is chosen in such a way that this equation has a
strong solution and $\inf\{t > 0 : \tilde\pi_t \notin \mathcal{S}^{d-1}\} = \infty$ a.s. In the sequel
we consider the case $f(\tilde\pi) = \tilde\Lambda^*\tilde\pi$, which clearly satisfies the
requirements. We formulate the more general result here, as it might be of
interest in other contexts (see Remark 2.8).

Proposition 2.6. Let $\tilde\pi_t$ be as above. Then the difference between $\tilde\pi_t$
and the Wonham filter started at $\mu$ is a.s. given by

$\tilde\pi_t - \pi_t(\mu) = \int_0^t D\pi_{s,t}(\tilde\pi_s)\cdot\big(f(\tilde\pi_s) - \Lambda^*\tilde\pi_s\big)\,ds,$

where $D\pi_{s,t}(\mu)\cdot v$ is the derivative of $\pi_{s,t}(\mu)$ in the direction $v \in T\mathcal{S}^{d-1}$.
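Proof. Define the (scalar) process $\Gamma_t$ by

$\Gamma_t = \exp\Big(\int_0^t h^*\tilde\pi_s\,dY_s - \frac12\int_0^t (h^*\tilde\pi_s)^2\,ds\Big).$

Using Ito's rule, we evaluate

(2.7) $\frac{d}{ds}\big(\Gamma_s U_{0,s}^{-1}\tilde\pi_s\big) = \Gamma_s U_{0,s}^{-1}\big(f(\tilde\pi_s) - \Lambda^*\tilde\pi_s\big).$

Multiplying both sides by $U_{0,t}$, we obtain

$\frac{d}{ds}\big(\Gamma_s U_{s,t}\tilde\pi_s\big) = \Gamma_s U_{s,t}\big(f(\tilde\pi_s) - \Lambda^*\tilde\pi_s\big).$

Now introduce as before the map $\Sigma:\mathbb{R}^d_{++}\to\mathcal{S}^{d-1}$, $\Sigma(x) = x/|x|$, which is
smooth on $\mathbb{R}^d_{++}$. Define the matrix $D\Sigma(x)$ with elements

$[D\Sigma(x)]^{ij} = \frac{\partial\Sigma^i(x)}{\partial x^j} = \frac{1}{|x|}\,[\delta_{ij} - \Sigma^i(x)].$

Note that $\Sigma(\alpha x) = \Sigma(x)$ for any $\alpha > 0$. Hence

$\frac{d}{ds}\Sigma(U_{s,t}\tilde\pi_s) = \frac{d}{ds}\Sigma(\Gamma_s U_{s,t}\tilde\pi_s) = D\Sigma(\Gamma_s U_{s,t}\tilde\pi_s)\,\Gamma_s U_{s,t}\big(f(\tilde\pi_s) - \Lambda^*\tilde\pi_s\big).$

But then we have, using $D\Sigma(\alpha x) = \alpha^{-1}D\Sigma(x)$ ($\alpha > 0$),

$\frac{d}{ds}\Sigma(U_{s,t}\tilde\pi_s) = D\Sigma(U_{s,t}\tilde\pi_s)\,U_{s,t}\big(f(\tilde\pi_s) - \Lambda^*\tilde\pi_s\big).$

On the other hand, we obtain from the representation $\pi_{s,t}(\mu) = \Sigma(U_{s,t}\mu)$

$D\pi_{s,t}(\mu)\cdot v = D\Sigma(U_{s,t}\mu)\,U_{s,t}v, \qquad \mu \in \mathcal{S}^{d-1},\ v \in T\mathcal{S}^{d-1}.$

Note that $f(\tilde\pi_s) - \Lambda^*\tilde\pi_s \in T\mathcal{S}^{d-1}$ as we required that
$f:\mathcal{S}^{d-1}\to T\mathcal{S}^{d-1}$, so that
$D\Sigma(U_{s,t}\tilde\pi_s)U_{s,t}(f(\tilde\pi_s) - \Lambda^*\tilde\pi_s) = D\pi_{s,t}(\tilde\pi_s)\cdot(f(\tilde\pi_s) - \Lambda^*\tilde\pi_s)$.
Finally, note that

$\tilde\pi_t - \pi_t(\mu) = \Sigma(U_{t,t}\tilde\pi_t) - \Sigma(U_{0,t}\tilde\pi_0) = \int_0^t \frac{d}{ds}\Sigma(U_{s,t}\tilde\pi_s)\,ds,$

and the proof is complete. □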
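
The properties of the normalization map $x \mapsto x/|x|$ used in this proof are easy to check numerically. The short sketch below is an illustration only (the function names and the test point are arbitrary choices): it compares the closed-form Jacobian with entries $(\delta_{ij} - \Sigma^i(x))/|x|$ against a central finite difference, and evaluates the Jacobian at a rescaled point to illustrate the scaling rule $D\Sigma(\alpha x) = \alpha^{-1}D\Sigma(x)$.

```python
def sigma(x):
    """Normalization map Sigma(x) = x / |x| on the positive orthant."""
    total = sum(x)
    return [xi / total for xi in x]

def d_sigma(x):
    """Closed-form Jacobian with entries (delta_ij - Sigma^i(x)) / |x|."""
    total, s, d = sum(x), sigma(x), len(x)
    return [[((1.0 if i == j else 0.0) - s[i]) / total for j in range(d)]
            for i in range(d)]

def finite_difference(x, eps=1e-6):
    """Central-difference approximation of the Jacobian of sigma."""
    d = len(x)
    jac = [[0.0] * d for _ in range(d)]
    for j in range(d):
        up = [xi + (eps if k == j else 0.0) for k, xi in enumerate(x)]
        dn = [xi - (eps if k == j else 0.0) for k, xi in enumerate(x)]
        su, sd = sigma(up), sigma(dn)
        for i in range(d):
            jac[i][j] = (su[i] - sd[i]) / (2 * eps)
    return jac

x, alpha = [0.2, 1.5, 0.7], 3.0            # arbitrary point in the positive orthant
exact, approx = d_sigma(x), finite_difference(x)
scaled = d_sigma([alpha * xi for xi in x])  # should equal exact / alpha
print(max(abs(exact[i][j] - approx[i][j]) for i in range(3) for j in range(3)))
```

Each column of the Jacobian sums to zero, which reflects the fact that $D\Sigma(x)$ maps into the tangent space of the simplex.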

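Corollary 2.7. The following estimate holds:

$|\tilde\pi_t - \pi_t(\mu)| \le \int_0^t |D\pi_{s,t}(\tilde\pi_s)|\,|f(\tilde\pi_s) - \Lambda^*\tilde\pi_s|\,ds,$

where $|D\pi_{s,t}(\mu)| = \sup\{|D\pi_{s,t}(\mu)\cdot v| : v \in T\mathcal{S}^{d-1},\ |v| = 1\}$. Moreover

$|\tilde\pi_t - \pi_t(\nu)| \le |\pi_t(\mu) - \pi_t(\nu)| + \int_0^t |D\pi_{s,t}(\tilde\pi_s)|\,|f(\tilde\pi_s) - \Lambda^*\tilde\pi_s|\,ds.$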

Remark 2.8. Corollary 2.7 suggests that the method used here could 
be applicable to a wider class of filter approximations than those obtained 
by misspecification of the underlying model. In particular, in the infinite- 
dimensional setting it is known [5] that by projecting the filter onto a prop- 
erly chosen finite-dimensional manifold, one can obtain finite-dimensional 
approximate filters that take a form very similar to equation (2.6). In order 
to obtain useful error bounds for such approximations one would need to 
have a fairly tight estimate on the derivative of the filter with respect to 
its initial condition. Unfortunately, worst-case estimates of the type devel- 
oped in Section 3 are not sufficiently tight to give quantitative results on 
the approximation error, even in the finite-state case. In the remainder of 
the article we will restrict ourselves to studying the robustness problem. 

In the following, it will be convenient to turn around the role of the exact
and approximate filters in Corollary 2.7, that is, we will use the estimate

(2.8) $|\pi_t(\nu) - \tilde\pi_t(\mu)| \le |\tilde\pi_t(\nu) - \tilde\pi_t(\mu)| + \int_0^t |D\tilde\pi_{s,t}(\pi_s)|\,|(\Lambda^* - \tilde\Lambda^*)\pi_s|\,ds,$

which holds provided $\tilde h = h$. The proof is identical to the one given above.
3. Exponential estimates for the derivative of the filter. In order for the
bound equation (2.8) to be useful, we must have an exponential estimate for
$|D\tilde\pi_{s,t}(\cdot)|$. The goal of this section is to obtain such an estimate. We
proceed in two steps. First, we use native filtering arguments as in [2] to
obtain an a.s. exponential estimate for $|D\pi_{0,t}(\nu)|$. As the laws of the
observation processes generated by signals with different initial distributions
and jump rates are equivalent, we can extend this a.s. bound to
$|D\tilde\pi_{s,t}(\mu)|$. We find, however, that the proportionality constant in the
exponential estimate depends on $\mu$ and diverges as $\mu$ approaches the boundary
of the simplex. This makes a pathwise bound on $|D\tilde\pi_{s,t}(\pi_s)|$ difficult to
obtain, as $\pi_s$ can get arbitrarily close to the boundary of the simplex on the
infinite time interval. Instead, we proceed to find a uniform bound on
$E|D\tilde\pi_{s,t}(\pi_s)|$.

We begin by recalling a few useful results from [2]. 

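Lemma 3.1. Assume $\mu, \nu$ are in the interior of the simplex. Then

(3.1) $\pi_t^i(\mu) = \frac{\sum_j (\mu^j/\nu^j)\,P(X_0 = a_j, X_t = a_i \mid \mathcal{F}_t^Y)}{\sum_j (\mu^j/\nu^j)\,P(X_0 = a_j \mid \mathcal{F}_t^Y)}.$

Proof. Define a new measure $P^\mu \sim P$ through

$\frac{dP^\mu}{dP} = \frac{d\mu}{d\nu}(X_0).$

It is not difficult to verify that under $P^\mu$, $X_t$ is still a finite-state Markov
process with intensities matrix $\Lambda$ but with initial distribution
$P^\mu(X_0 = a_i) = \mu^i$. Hence evidently $\pi_t^i(\mu) = P^\mu(X_t = a_i \mid \mathcal{F}_t^Y)$.
Using the usual change of measure formula for conditional expectations, we can
write

$\pi_t^i(\mu) = \frac{E_P\big(I_{\{X_t = a_i\}}\,\frac{d\mu}{d\nu}(X_0) \mid \mathcal{F}_t^Y\big)}{E_P\big(\frac{d\mu}{d\nu}(X_0) \mid \mathcal{F}_t^Y\big)}.$

The result now follows immediately. □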

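For the proof of the following lemma we refer to [2], Lemma 5.7, page
662.

Lemma 3.2. Define $\rho_t^{ji} = P(X_0 = a_j \mid \mathcal{F}_t^Y, X_t = a_i)$. Assume that
$\lambda_{ij} > 0$ $\forall i \ne j$. Then for any $t \ge 0$ we have the a.s. bound

$\max_{j,k,\ell}\,|\rho_t^{jk} - \rho_t^{j\ell}| \le \exp\Big(-2t\min_{p,\,q\ne p}\sqrt{\lambda_{pq}\lambda_{qp}}\Big).$

We are now ready to obtain some useful estimates.

Proposition 3.3. Let $\lambda_{ij} > 0$ $\forall i \ne j$ and $\nu \in \mathcal{S}^{d-1}$, $v \in T\mathcal{S}^{d-1}$. Then
a.s.

$|D\pi_t(\nu)\cdot v| \le \sum_k \frac{|v^k|}{\nu^k}\,\exp\Big(-2t\min_{p,\,q\ne p}\sqrt{\lambda_{pq}\lambda_{qp}}\Big).$

Proof. We can calculate directly the directional derivative of (3.1):

$(D\pi_t(\mu)\cdot v)^i = \frac{\sum_j (v^j/\nu^j)\big(P(X_0 = a_j, X_t = a_i \mid \mathcal{F}_t^Y) - \pi_t^i(\mu)\,P(X_0 = a_j \mid \mathcal{F}_t^Y)\big)}{\sum_j (\mu^j/\nu^j)\,P(X_0 = a_j \mid \mathcal{F}_t^Y)}.$

Setting $\mu = \nu$, we obtain after some simple manipulations

$(D\pi_t(\nu)\cdot v)^i = \pi_t^i(\nu)\sum_{j,k}\frac{v^j}{\nu^j}\,\pi_t^k(\nu)\,(\rho_t^{ji} - \rho_t^{jk}).$

The result follows from Lemma 3.2. □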

To obtain this bound we had to use the true initial distribution $\nu$, jump
rates $\lambda_{ij}$ and observation function $h$. However, the almost sure nature of the
result allows us to drop these requirements.

Corollary 3.4. Let $\tilde\lambda_{ij} > 0$ $\forall i \ne j$ and $\mu \in \mathcal{S}^{d-1}$, $v \in T\mathcal{S}^{d-1}$. Then a.s.

(3.2) $|D\tilde\pi_{s,t}(\mu)\cdot v| \le \sum_k \frac{|v^k|}{\mu^k}\,\exp\Big(-2(t-s)\min_{p,\,q\ne p}\sqrt{\tilde\lambda_{pq}\tilde\lambda_{qp}}\Big).$

Moreover, the result still holds if $\mu, v$ are $\mathcal{F}_s^Y$-measurable random variables
with values a.s. in $\mathcal{S}^{d-1}$ and $T\mathcal{S}^{d-1}$, respectively.

Proof. Note that we can write $\tilde\pi_{0,t}^i(\mu) = \tilde P(X_t = a_i \mid \mathcal{F}_t^Y)$, where
$\tilde P$ is the measure under which $X_t$ has transition intensities matrix
$\tilde\Lambda$ and initial distribution $\mu$, and $dY_t = \tilde h(X_t)\,dt + dB_t$ where $B_t$ is a
Wiener process independent of $X_t$. But $\tilde P$ and $P$ are equivalent measures (by
the Girsanov theorem and [21], Section IV.22), so that the result for $s = 0$
follows trivially from Proposition 3.3. The result for $s > 0$ follows directly as
the Wonham equation is time homogeneous.

To show that the result still holds when $\mu, v$ are random, note that
$\tilde\pi_{s,t}$ only depends on the observation increments in the interval $[s,t]$,
that is, $D\tilde\pi_{s,t}(\mu)\cdot v$ is $\mathcal{F}_{[s,t]}^Y$-measurable where
$\mathcal{F}_{[s,t]}^Y = \sigma\{Y_r - Y_s : s \le r \le t\}$. Under the equivalent measure $Q$
introduced in Section 2, $Y$ is a Wiener process and hence $\mathcal{F}_{[s,t]}^Y$ and
$\mathcal{F}_s^Y$ are independent. It follows from the bound with constant $\mu, v$ that

$E_Q\big(I_{\{|D\tilde\pi_{s,t}(\mu)\cdot v| \le (*)\}} \mid \sigma\{\mu, v\}\big) = 1, \qquad Q\text{-a.s.},$

where $(*)$ is the right-hand side of (3.2). Hence
$E_Q(I_{\{|D\tilde\pi_{s,t}(\mu)\cdot v| \le (*)\}}) = 1$, and the statement follows from
$P \sim Q$. □
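Proposition 3.5. Let $\tilde\lambda_{ij} > 0$ $\forall i \ne j$ and $\mu_1, \mu_2 \in \mathcal{S}^{d-1}$. Then a.s.

$|\tilde\pi_{s,t}(\mu_2) - \tilde\pi_{s,t}(\mu_1)| \le C\,|\mu_2 - \mu_1|\,\exp\Big(-2(t-s)\min_{p,\,q\ne p}\sqrt{\tilde\lambda_{pq}\tilde\lambda_{qp}}\Big),$

where $C = \max\{1/\mu_1^k,\ 1/\mu_2^k : k = 1, \ldots, d\}$.

Proof. Define $\gamma(u) = \tilde\pi_{s,t}(\mu_1 + u(\mu_2 - \mu_1))$, $u \in [0,1]$. Then

$\tilde\pi_{s,t}(\mu_2) - \tilde\pi_{s,t}(\mu_1) = \int_0^1 \frac{d\gamma}{du}\,du = \int_0^1 D\tilde\pi_{s,t}(\mu_1 + u(\mu_2 - \mu_1))\cdot(\mu_2 - \mu_1)\,du.$

We can thus estimate

$|\tilde\pi_{s,t}(\mu_2) - \tilde\pi_{s,t}(\mu_1)| \le \sup_{u\in[0,1]} |D\tilde\pi_{s,t}(\mu_1 + u(\mu_2 - \mu_1))\cdot(\mu_2 - \mu_1)|.$

The result now follows from Corollary 3.4. □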

Corollary 3.4 and Proposition 3.5 are exactly what we need to establish
boundedness of equation (2.8). Note, however, that the right-hand side of
(3.2) is proportional to $1/\mu^k$, and we must estimate $|D\tilde\pi_{s,t}(\pi_s)|$. Though
we established in Section 2 that $\pi_s$ cannot hit the boundary of the simplex in
finite time, it can get arbitrarily close to the boundary during the infinite
time interval, thus rendering the right-hand side of equation (3.2) arbitrarily
large. If we can establish that $\sup_{s\ge 0} E(1/\min_k \pi_s^k) < \infty$, however, then we
can control $E|D\tilde\pi_{s,t}(\pi_s)|$ to obtain a useful bound.

We begin with an auxiliary integrability property of $\pi_t$:

Lemma 3.6. Let $\nu \in \mathcal{S}^{d-1}$ and $T < \infty$. Then

$E\int_0^T (\pi_s^i)^{-k}\,ds < \infty \qquad \forall i = 1, \ldots, d,\ k \ge 1.$
Proof. Applying Ito's rule to the Wonham equation gives

$d\log\pi_t^i = \big(\lambda_{ii} - \tfrac12(h^i - h^*\pi_t)^2\big)\,dt + \sum_{j\ne i}\lambda_{ji}\frac{\pi_t^j}{\pi_t^i}\,dt + (h^i - h^*\pi_t)\,dW_t,$

where the innovation $dW_t = dY_t - h^*\pi_t\,dt$ is an $\mathcal{F}_t^Y$-Wiener process. The
application of Ito's rule is justified by a standard localization argument, as
$\pi_t$ is in $\mathcal{S}^{d-1}$ for all $t \ge 0$ a.s. and $\log x$ is smooth in $(0,1)$. As
$\lambda_{ij} \ge 0$ for $j \ne i$, we estimate

$-k\log\pi_t^i \le -k\log\nu^i - k\lambda_{ii}t + \frac{k}{2}\max_j (h^i - h^j)^2\,t - k\int_0^t (h^i - h^*\pi_s)\,dW_s.$

But as $h^i - h^*\pi_t$ is bounded, Novikov's condition is satisfied and hence

$E\exp\Big(-k\int_0^t (h^i - h^*\pi_s)\,dW_s - \frac{k^2}{2}\int_0^t (h^i - h^*\pi_s)^2\,ds\Big) = 1.$

Estimating the time integral, we obtain

$E(\pi_t^i)^{-k} \le (\nu^i)^{-k}\exp\Big(-k\lambda_{ii}t + \tfrac12 k(k+1)\max_j (h^i - h^j)^2\,t\Big).$

The lemma now follows by the Fubini-Tonelli theorem, as $(\pi_t^i)^{-k} > 0$ a.s. □


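We are now in a position to bound $\sup_{t\ge 0} E(1/\min_i \pi_t^i)$.

Proposition 3.7. Let $\nu \in \mathcal{S}^{d-1}$ and suppose that $\lambda_{ij} > 0$ $\forall i \ne j$. Then

$\sup_{t\ge 0} E\Big(\frac{1}{\min_i \pi_t^i}\Big) < \infty.$

Proof. By Ito's rule and using the standard localization argument, we
obtain

$(\pi_t^i)^{-1} = (\nu^i)^{-1} - \int_0^t \lambda_{ii}(\pi_s^i)^{-1}\,ds - \int_0^t (\pi_s^i)^{-2}\sum_{j\ne i}\lambda_{ji}\pi_s^j\,ds - \int_0^t (\pi_s^i)^{-1}(h^i - h^*\pi_s)\,dW_s + \int_0^t (\pi_s^i)^{-1}(h^i - h^*\pi_s)^2\,ds,$

where $W_t$ is the innovations Wiener process. Using Lemma 3.6 we find

$E\int_0^t (\pi_s^i)^{-2}(h^i - h^*\pi_s)^2\,ds \le \max_j (h^i - h^j)^2\,E\int_0^t (\pi_s^i)^{-2}\,ds < \infty,$

so the expectation of the stochastic integral term vanishes. Using the
Fubini-Tonelli theorem, we can thus write

$E\big((\pi_t^i)^{-1}\big) = (\nu^i)^{-1} - \int_0^t \lambda_{ii}\,E\big((\pi_s^i)^{-1}\big)\,ds - \int_0^t E\Big((\pi_s^i)^{-2}\sum_{j\ne i}\lambda_{ji}\pi_s^j\Big)\,ds + \int_0^t E\big((\pi_s^i)^{-1}(h^i - h^*\pi_s)^2\big)\,ds.$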

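Taking the derivative and estimating each of the terms, we obtain

$\frac{dM_t^i}{dt} \le -\min_{j\ne i}\lambda_{ji}\,(M_t^i)^2 + \Big(|\lambda_{ii}| + \min_{j\ne i}\lambda_{ji} + \max_j (h^i - h^j)^2\Big) M_t^i,$

where we have written $M_t^i = E((\pi_t^i)^{-1})$ and we have used
$(M_t^i)^2 \le E(\pi_t^i)^{-2}$ by Jensen's inequality. Using the estimate

$-K_1^i (M_t^i)^2 + K_2^i M_t^i \le -K_2^i M_t^i + \frac{(K_2^i)^2}{K_1^i} \qquad \text{for } K_1^i > 0,$

we now obtain

$\frac{dM_t^i}{dt} \le K_2^i\Big(\frac{K_2^i}{K_1^i} - M_t^i\Big), \qquad K_2^i = |\lambda_{ii}| + \min_{j\ne i}\lambda_{ji} + \max_j (h^i - h^j)^2,$

where $K_1^i = \min_{j\ne i}\lambda_{ji} > 0$. Consequently we obtain

$M_t^i \le \frac{e^{-K_2^i t}}{\nu^i} + \frac{(K_2^i)^2}{K_1^i}\int_0^t e^{-K_2^i (t-s)}\,ds \le \frac{1}{\nu^i} \vee \frac{K_2^i}{K_1^i}.$

We can now estimate

$\sup_{t\ge 0} E\Big(\frac{1}{\min_i \pi_t^i}\Big) \le \sum_{i=1}^d \sup_{t\ge 0} E\Big(\frac{1}{\pi_t^i}\Big) \le \sum_{i=1}^d \Big(\frac{1}{\nu^i} \vee \frac{K_2^i}{K_1^i}\Big) < \infty,$

which is what we set out to prove. □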
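
The comparison argument at the end of this proof can be checked on concrete numbers. The sketch below is illustrative only: the constants `k1`, `k2` and the initial value `m0` (playing the role of $1/\nu^i$) are hypothetical, and `bound` is simply the explicit solution of the linear comparison equation $dm/dt = k_2(k_2/k_1 - m)$. It verifies the completing-the-square inequality on a few sample points and confirms that the comparison solution stays below $\max(m_0, k_2/k_1)$.

```python
import math

def bound(m0, k1, k2, t):
    """Solution at time t of dm/dt = k2 * (k2 / k1 - m) with m(0) = m0."""
    limit = k2 / k1
    return limit + (m0 - limit) * math.exp(-k2 * t)

k1, k2, m0 = 0.5, 2.0, 10.0     # hypothetical constants, chosen for illustration

# Completing the square: -k1 m^2 + k2 m <= -k2 m + k2^2 / k1 for every real m,
# because the difference equals (sqrt(k1) * m - k2 / sqrt(k1))^2 >= 0.
for m in [0.0, 0.5, 1.0, 4.0, 10.0, 50.0]:
    assert -k1 * m * m + k2 * m <= -k2 * m + k2 * k2 / k1 + 1e-12

# The comparison solution never exceeds max(m0, k2 / k1), uniformly in time.
values = [bound(m0, k1, k2, 0.1 * n) for n in range(200)]
print(max(values), max(m0, k2 / k1))
```

The solution is a convex combination of `m0` and `k2 / k1` at every time, which is why the maximum of the two is a time-uniform bound.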

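We can now prove Theorem 1.1 for the special case $\tilde h = h$. Using equation
(2.8), Corollary 3.4, Proposition 3.5 and Proposition 3.7, we obtain

$E|\pi_t - \tilde\pi_t(\mu)| \le |\mu - \nu|\,\max_k\Big\{\frac{1}{\mu^k} \vee \frac{1}{\nu^k}\Big\}\exp\Big(-2t\min_{p,\,q\ne p}\sqrt{\tilde\lambda_{pq}\tilde\lambda_{qp}}\Big) + |\Lambda^* - \tilde\Lambda^*|\,\sup_{s\ge 0} E\Big(\frac{1}{\min_k \pi_s^k}\Big)\int_0^t \exp\Big(-2(t-s)\min_{p,\,q\ne p}\sqrt{\tilde\lambda_{pq}\tilde\lambda_{qp}}\Big)\,ds,$

where $|\Lambda^* - \tilde\Lambda^*| = \sup\{|(\Lambda^* - \tilde\Lambda^*)\mu| : \mu \in \mathcal{S}^{d-1}\}$. Thus

$E|\pi_t - \tilde\pi_t(\mu)| \le |\mu - \nu|\,\max_k\Big\{\frac{1}{\mu^k} \vee \frac{1}{\nu^k}\Big\}\,e^{-\beta t} + \frac{|\Lambda^* - \tilde\Lambda^*|\,\sup_{s\ge 0} E(1/\min_k \pi_s^k)}{\beta},$

where we have written $\beta = 2\min_{p,\,q\ne p}(\tilde\lambda_{pq}\tilde\lambda_{qp})^{1/2}$. The result
follows directly using $\|\pi_t - \tilde\pi_t(\mu)\|^2 \le |\pi_t - \tilde\pi_t(\mu)|$
[as $|\pi_t^i - \tilde\pi_t^i(\mu)| \le 1$].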
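
The passage from the first bound to the second in this argument rests on an elementary integral, spelled out here as a worked step. With $\beta$ denoting the exponential rate (positive under the mixing condition), the time integral is bounded uniformly in $t$:

```latex
\int_0^t e^{-\beta (t-s)}\,ds
  \;=\; \frac{1 - e^{-\beta t}}{\beta}
  \;\le\; \frac{1}{\beta},
\qquad \beta > 0 .
```

This is the point where exponential decay of the derivative prevents the accumulation of errors: without the factor $e^{-\beta(t-s)}$ the integral would grow linearly in $t$.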

4. Model robustness of the Wonham filter. We are now ready to proceed 
to the general case where the initial density, the transition intensities matrix 
and the observation function can all be misspecified. The simplicity of the 
special case $\tilde h = h$ that we have treated up to this point is due to the fact
that in the calculation of equation (2.7), the stochastic integral term drops 
out and we can proceed with the calculation using only ordinary calculus. 
In the general case we cannot get rid of the stochastic integral, and hence
we run into anticipativity problems in the next step of the calculation. 

We solve this problem by using anticipative stochastic integrals in the 
sense of Skorokhod, rather than the usual Ito integral (which is a special 
case of the Skorokhod integral defined for adapted processes only). Though
the Skorokhod integral is more general than the Ito integral in the sense
that it allows some anticipating integrands, it is less general in that we
have to integrate against a Wiener process (rather than against an arbitrary
semimartingale), and that the integrands should be functionals of the driving
Wiener process. In our setup, the most convenient way to deal with this is
to operate exclusively under the measure Q of Section 2, under which the
observation process Y is a Wiener process. At the end of the day we can
calculate the relevant expectation with respect to the measure P by using 
the explicit expression for the Radon-Nikodym derivative dP/dQ. The fact 
that the integrands must be functionals of the underlying Wiener process 
is not an issue, as both the approximate and exact filters are functionals of 
the observations only. 

Our setup is further detailed in Appendix A, together with a review of 
the relevant results from the Malliavin calculus and anticipative stochastic 
calculus. Below we will use the notation and results from this appendix 
without further comment. We will also refer to Appendix B for some results 
on smoothness of the various integrands we encounter; these results are not 
central to the calculations, but are required for the application of the theory 
in Appendix A. 

We begin by obtaining an anticipative version of Proposition 2.6. Note 
that this result is precisely of the form one would expect. The first two lines 
follow the formula for the distance between two flows as one would guess, 
for example, from the discussion in the Introduction; the last line is an Ito 
correction term which contains second derivatives of the filter with respect 
to its initial condition. 

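Proposition 4.1. The difference between $\pi_t$ and $\tilde\pi_t$ satisfies

$\tilde\pi_t - \pi_t = \int_0^t D\pi_{r,t}(\tilde\pi_r)\cdot\Delta\Lambda\,\tilde\pi_r\,dr + \int_0^t D\pi_{r,t}(\tilde\pi_r)\cdot\Delta H(\tilde\pi_r)\,dY_r$

$\qquad - \int_0^t D\pi_{r,t}(\tilde\pi_r)\cdot\big[\tilde h^*\tilde\pi_r\,(\tilde H - \tilde h^*\tilde\pi_r)\tilde\pi_r - h^*\tilde\pi_r\,(H - h^*\tilde\pi_r)\tilde\pi_r\big]\,dr$

$\qquad + \frac12\int_0^t \big[D^2\pi_{r,t}(\tilde\pi_r)\cdot(\tilde H - \tilde h^*\tilde\pi_r)\tilde\pi_r - D^2\pi_{r,t}(\tilde\pi_r)\cdot(H - h^*\tilde\pi_r)\tilde\pi_r\big]\,dr,$

where the stochastic integral is a Skorokhod integral and we have written
$\Delta\Lambda = \tilde\Lambda^* - \Lambda^*$, $\Delta H(\pi) = (\tilde H - \tilde h^*\pi)\pi - (H - h^*\pi)\pi$,
and $D^2\pi_{r,t}(\mu)\cdot v$ is the directional derivative of $D\pi_{r,t}(\mu)\cdot v$ with
respect to $\mu \in \mathcal{S}^{d-1}$ in the direction $v \in T\mathcal{S}^{d-1}$.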
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Proof. Fix some T > t. We begin by evaluating, using Ito's rule and 
equation (2.5), 

Uo}Uo,si^ = u+ r U^l{K* - K*)Uo,rvdr 
Jo 

- rty^lH{H-H)Uo,rydr+ [' U^^ {H - H)Uo,r'^ dY^ . 

JO ' JO ' 

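Now multiply from the left by $U_{0,t}$; we wish to use Lemma A.5 to bring $U_{0,t}$
into the Skorokhod integral term, that is, we claim that

\[
\begin{aligned}
U_{s,t}\tilde U_{0,s}\nu ={}& U_{0,t}\nu + \int_0^s U_{r,t}(\tilde\Lambda^* - \Lambda^*)\tilde U_{0,r}\nu\,dr - \int_0^s U_{r,t}H(\tilde H - H)\tilde U_{0,r}\nu\,dr \\
&+ \int_0^s U_{r,t}(\tilde H - H)\tilde U_{0,r}\nu\,dY_r + \int_0^s (D_rU_{0,t})\,U_{0,r}^{-1}(\tilde H - H)\tilde U_{0,r}\nu\,dr .
\end{aligned}
\]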

To justify this expression we need to verify the integrability conditions of
Lemma A.5. Note that all matrix elements of $U_{s,t}$ are in $\mathbb{D}^\infty$ $\forall\, 0 \le s \le t \le T$,
and that

\[
D_rU_{s,t} = \begin{cases} 0, & \text{a.e. } r \notin [s,t], \\ U_{r,t}HU_{s,r}, & \text{a.e. } r \in [s,t]. \end{cases}
\]

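This follows directly from Proposition A.4 and Lemma 2.3 (note that the
same result holds for $\tilde U_{s,t}$ if we replace $H$ by $\tilde H$ and $U$ by $\tilde U$). Once we
plug this result into the expression above, the last integral becomes
$\int_0^s U_{r,t}H(\tilde H - H)\tilde U_{0,r}\nu\,dr$ and cancels the second time integral. The corresponding
integrability conditions can be verified explicitly, see Lemma B.1, and hence we have
verified that

\[
U_{s,t}\tilde U_{0,s}\nu = U_{0,t}\nu + \int_0^s U_{r,t}(\tilde\Lambda^* - \Lambda^*)\tilde U_{0,r}\nu\,dr + \int_0^s U_{r,t}(\tilde H - H)\tilde U_{0,r}\nu\,dY_r .
\]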

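Next we would like to apply the anticipating Itô rule, Proposition A.6, with
the function $\Sigma : \mathbb{R}^d_{++} \to S^{d-1}$, $\Sigma(x) = x/|x|$. To this end we have to verify a
set of technical conditions, see Lemma B.2. We obtain

\[
\begin{aligned}
\Sigma(U_{s,t}\tilde U_{0,s}\nu) ={}& \Sigma(U_{0,t}\nu) + \int_0^s D\Sigma(U_{r,t}\tilde U_{0,r}\nu)\,U_{r,t}(\tilde\Lambda^* - \Lambda^*)\tilde U_{0,r}\nu\,dr \\
&+ \frac12\sum_{k,\ell}\int_0^s \frac{\partial^2\Sigma}{\partial x^k\partial x^\ell}(U_{r,t}\tilde U_{0,r}\nu)\,(\nabla_rU_{r,t}\tilde U_{0,r}\nu)^k\,(U_{r,t}(\tilde H - H)\tilde U_{0,r}\nu)^\ell\,dr \\
&+ \int_0^s D\Sigma(U_{r,t}\tilde U_{0,r}\nu)\,U_{r,t}(\tilde H - H)\tilde U_{0,r}\nu\,dY_r .
\end{aligned}
\]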
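We need to evaluate $\nabla_rU_{r,t}\tilde U_{0,r}\nu$. Using Proposition A.2, we calculate

\[
\lim_{\varepsilon\searrow 0} D_rU_{r+\varepsilon,t}\tilde U_{0,r+\varepsilon}\nu = \lim_{\varepsilon\searrow 0} U_{r+\varepsilon,t}\tilde U_{r,r+\varepsilon}\tilde H\tilde U_{0,r}\nu = U_{r,t}\tilde H\tilde U_{0,r}\nu,
\]

and similarly

\[
\lim_{\varepsilon\searrow 0} D_rU_{r-\varepsilon,t}\tilde U_{0,r-\varepsilon}\nu = \lim_{\varepsilon\searrow 0} U_{r,t}HU_{r-\varepsilon,r}\tilde U_{0,r-\varepsilon}\nu = U_{r,t}H\tilde U_{0,r}\nu .
\]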

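After some rearranging, we obtain

\[
\begin{aligned}
\Sigma(U_{s,t}\tilde U_{0,s}\nu) ={}& \Sigma(U_{0,t}\nu) + \int_0^s D\Sigma(U_{r,t}\tilde U_{0,r}\nu)\,U_{r,t}(\tilde\Lambda^* - \Lambda^*)\tilde U_{0,r}\nu\,dr \\
&+ \frac12\sum_{k,\ell}\int_0^s \frac{\partial^2\Sigma}{\partial x^k\partial x^\ell}(U_{r,t}\tilde U_{0,r}\nu)\,(U_{r,t}\tilde H\tilde U_{0,r}\nu)^k\,(U_{r,t}\tilde H\tilde U_{0,r}\nu)^\ell\,dr \\
&- \frac12\sum_{k,\ell}\int_0^s \frac{\partial^2\Sigma}{\partial x^k\partial x^\ell}(U_{r,t}\tilde U_{0,r}\nu)\,(U_{r,t}H\tilde U_{0,r}\nu)^k\,(U_{r,t}H\tilde U_{0,r}\nu)^\ell\,dr \\
&+ \int_0^s D\Sigma(U_{r,t}\tilde U_{0,r}\nu)\,U_{r,t}(\tilde H - H)\tilde U_{0,r}\nu\,dY_r .
\end{aligned}
\]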
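From this point onward we will set $s = t$. We will need (on $\mathbb{R}^d_{++}$)

\[
\frac{\partial^2\Sigma^i}{\partial x^k\partial x^\ell}(x) = -\frac{1}{|x|}\bigl[D\Sigma^{ik}(x) + D\Sigma^{i\ell}(x)\bigr],
\qquad D\Sigma^{ik}(x) = \frac{1}{|x|}\bigl[\delta^{ik} - \Sigma^i(x)\bigr].
\]

Recall that $D\Sigma(\alpha x) = \alpha^{-1}D\Sigma(x)$; it follows that also $D^2\Sigma(\alpha x) = \alpha^{-2}D^2\Sigma(x)$
for $\alpha > 0$. Using these expressions with $\alpha = |\tilde U_{0,r}\nu|$, we get

\[
\begin{aligned}
\tilde\pi_t - \pi_t ={}& \int_0^t D\Sigma(U_{r,t}\tilde\pi_r)\,U_{r,t}\,\Delta\Lambda\,\tilde\pi_r\,dr + \int_0^t D\Sigma(U_{r,t}\tilde\pi_r)\,U_{r,t}(\tilde H - H)\tilde\pi_r\,dY_r \\
&+ \frac12\sum_{k,\ell}\int_0^t \frac{\partial^2\Sigma}{\partial x^k\partial x^\ell}(U_{r,t}\tilde\pi_r)\,(U_{r,t}\tilde H\tilde\pi_r)^k\,(U_{r,t}\tilde H\tilde\pi_r)^\ell\,dr \\
&- \frac12\sum_{k,\ell}\int_0^t \frac{\partial^2\Sigma}{\partial x^k\partial x^\ell}(U_{r,t}\tilde\pi_r)\,(U_{r,t}H\tilde\pi_r)^k\,(U_{r,t}H\tilde\pi_r)^\ell\,dr .
\end{aligned}
\]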

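Next we want to express the integrands in terms of $D\pi_{r,t}(\tilde\pi_r)\cdot v$, and so on,
rather than in terms of $D\Sigma(x)$. Recall that $D\pi_{r,t}(\tilde\pi_r)\cdot v = D\Sigma(U_{r,t}\tilde\pi_r)U_{r,t}v$
when $v \in TS^{d-1}$. Similar terms appear in the expression above, but, for
example, $\tilde H\tilde\pi_r \notin TS^{d-1}$. To rewrite the expression in the desired form, we
use that $D\Sigma(U_{r,t}\tilde\pi_r)U_{r,t}\tilde\pi_r = 0$. Hence

\[
D\Sigma(U_{r,t}\tilde\pi_r)\,U_{r,t}\tilde H\tilde\pi_r = D\Sigma(U_{r,t}\tilde\pi_r)\,U_{r,t}(\tilde H - \tilde h^*\tilde\pi_r)\tilde\pi_r
\]

and similarly for the other terms. Note also that

\[
\sum_k \frac{\partial^2\Sigma^i}{\partial x^k\partial x^\ell}(x)\,x^k = -D\Sigma^{i\ell}(x).
\]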

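Substituting this into the expression for $\tilde\pi_t - \pi_t$ and rearranging, we obtain

\[
\begin{aligned}
\tilde\pi_t - \pi_t ={}& \int_0^t D\pi_{r,t}(\tilde\pi_r)\cdot\Delta\Lambda\,\tilde\pi_r\,dr + \int_0^t D\pi_{r,t}(\tilde\pi_r)\cdot\Delta H(\tilde\pi_r)\,dY_r \\
&- \int_0^t D\pi_{r,t}(\tilde\pi_r)\cdot\bigl[\tilde h^*\tilde\pi_r\,(\tilde H - \tilde h^*\tilde\pi_r)\tilde\pi_r - h^*\tilde\pi_r\,(H - h^*\tilde\pi_r)\tilde\pi_r\bigr]\,dr \\
&+ \frac12\sum_{k,\ell}\int_0^t \frac{\partial^2\Sigma}{\partial x^k\partial x^\ell}(U_{r,t}\tilde\pi_r)\,(U_{r,t}(\tilde H - \tilde h^*\tilde\pi_r)\tilde\pi_r)^k\,(U_{r,t}(\tilde H - \tilde h^*\tilde\pi_r)\tilde\pi_r)^\ell\,dr \\
&- \frac12\sum_{k,\ell}\int_0^t \frac{\partial^2\Sigma}{\partial x^k\partial x^\ell}(U_{r,t}\tilde\pi_r)\,(U_{r,t}(H - h^*\tilde\pi_r)\tilde\pi_r)^k\,(U_{r,t}(H - h^*\tilde\pi_r)\tilde\pi_r)^\ell\,dr .
\end{aligned}
\]

It remains to note that we can write

\[
D^2\pi_{r,t}(\mu)\cdot v = \sum_{k,\ell}\frac{\partial^2\Sigma}{\partial x^k\partial x^\ell}(U_{r,t}\mu)\,(U_{r,t}v)^k\,(U_{r,t}v)^\ell .
\]

The result follows immediately. □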
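The algebraic facts about the normalization map used in this proof can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes $|x|$ denotes the sum of the entries of a positive vector $x$, takes the closed forms $D\Sigma^{ik}(x) = (\delta^{ik} - \Sigma^i(x))/|x|$ and $\partial^2\Sigma^i/\partial x^k\partial x^\ell = -(D\Sigma^{ik} + D\Sigma^{i\ell})/|x|$ as given, and compares them with central finite differences. The function names and the test point are hypothetical.

```python
import numpy as np

def sigma(x):
    """Normalization map Sigma(x) = x / |x|, with |x| the sum of the positive entries."""
    return x / x.sum()

def d_sigma(x):
    """Jacobian with entries DSigma^{ik}(x) = (delta^{ik} - Sigma^i(x)) / |x|."""
    return (np.eye(x.size) - sigma(x)[:, None]) / x.sum()

def d2_sigma(x):
    """Second derivatives indexed [i, k, l]: -(DSigma^{ik}(x) + DSigma^{il}(x)) / |x|."""
    jac = d_sigma(x)
    return -(jac[:, :, None] + jac[:, None, :]) / x.sum()

def identity_errors(x, alpha=3.0, eps=1e-5):
    """Return the largest absolute violation of each identity at the point x."""
    basis = np.eye(x.size)
    # Central finite differences serve as an independent check of the closed forms.
    jac_fd = np.stack(
        [(sigma(x + eps * e) - sigma(x - eps * e)) / (2 * eps) for e in basis], axis=1)
    hess_fd = np.stack(
        [(d_sigma(x + eps * e) - d_sigma(x - eps * e)) / (2 * eps) for e in basis], axis=2)
    jac, hess = d_sigma(x), d2_sigma(x)
    return {
        "jacobian_formula": np.abs(jac - jac_fd).max(),
        "hessian_formula": np.abs(hess - hess_fd).max(),
        # DSigma(x) x = 0: the radial direction is annihilated.
        "radial_direction": np.abs(jac @ x).max(),
        # sum_k d2Sigma^i/dx^k dx^l (x) x^k = -DSigma^{il}(x).
        "contraction": np.abs(np.einsum("ikl,k->il", hess, x) + jac).max(),
        # DSigma(alpha x) = DSigma(x) / alpha and D2Sigma(alpha x) = D2Sigma(x) / alpha^2.
        "scaling_first": np.abs(d_sigma(alpha * x) - jac / alpha).max(),
        "scaling_second": np.abs(d2_sigma(alpha * x) - hess / alpha**2).max(),
    }

errors = identity_errors(np.array([0.5, 1.5, 2.0]))
```

Each entry of `errors` should be at the level of floating-point or finite-difference error; the check says nothing about the stochastic part of the argument.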

Remark 4.2. We have allowed misspecification of most model parameters
of the Wonham filter. One exception is the observation noise intensity:
we have not considered observations of the form $dY_t = h(X_t)\,dt + \sigma\,dB_t$ with
$\sigma \ne 1$; in other words, the quadratic variation of $Y_t$ is assumed to be known,
$[Y, Y]_t = t$. We do not consider this a significant drawback as the quadratic
variation can be determined directly from the observation process $Y_t$. On the
other hand, the model parameters $\nu$, $\Lambda$, $h$ are "hidden" and would have to
be estimated, making these quantities much more prone to modeling errors.

If we allow misspecification of $\sigma$, we would have to be careful to specify in
which way the filter is implemented: in this case, the normalized solution of 
the misspecified Zakai equation no longer coincides with the solution of the 
misspecified Wonham equation. Hence one obtains a different error estimate 
depending on whether the normalized solution of the misspecified Zakai 
equation, or the solution of the misspecified Wonham equation, is compared 
to the exact filter. Both cases can be treated using similar methods, but we 
do not pursue this here. 
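The claim that the noise intensity can be read off from the observations can be illustrated on simulated data. The sketch below is only an illustration under stated assumptions: the two-state rate matrix, the observation function, the step size and the function names are hypothetical choices, and the chain is simulated with a simple first-order time discretization. It estimates $\sigma$ from the realized quadratic variation $\sum (\Delta Y)^2 \approx \sigma^2 T$.

```python
import numpy as np

def simulate_increments(sigma, T=10.0, n=100_000, seed=0):
    """Simulate increments of dY = h(X) dt + sigma dB for a two-state chain X."""
    rng = np.random.default_rng(seed)
    dt = T / n
    lam = np.array([[-1.0, 1.0], [2.0, -2.0]])  # hypothetical transition rates
    h = np.array([-1.0, 1.0])                   # hypothetical observation function
    states = np.empty(n, dtype=int)
    jump_draws = rng.random(n)
    x = 0
    for i in range(n):
        states[i] = x
        # First-order approximation: leave state x with probability -lam[x, x] * dt.
        if jump_draws[i] < -lam[x, x] * dt:
            x = 1 - x
    dB = rng.normal(0.0, np.sqrt(dt), size=n)
    return h[states] * dt + sigma * dB, dt

def estimate_sigma(dY, dt):
    """Estimate sigma from the realized quadratic variation over the window."""
    return np.sqrt(np.sum(dY**2) / (dY.size * dt))

dY, dt = simulate_increments(sigma=0.5)
sigma_hat = estimate_sigma(dY, dt)
```

The drift contributes only terms of order $dt^2$ per step to the sum of squares, which is why the estimate does not require knowledge of $h$ or of the signal.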

Let $e_t = \tilde\pi_t - \pi_t$. We wish to estimate the norm of $e_t$. Unfortunately, we
can no longer use the triangle inequality as in Section 2 due to the presence
of the stochastic integral; instead, we choose to calculate $\|e_t\|^2$, which is
readily estimated.






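Lemma 4.3. The filtering error can be estimated by

\[
\begin{aligned}
\mathbf{E}_P\|e_t\|^2 \le{}& \int_0^t \mathbf{E}_P\bigl|D\pi_{r,t}(\tilde\pi_r)\cdot\Delta\Lambda\,\tilde\pi_r\bigr|\,dr + K\int_0^t \mathbf{E}_P\bigl|D\pi_{r,t}(\tilde\pi_r)\cdot\Delta H(\tilde\pi_r)\bigr|\,dr \\
&+ \int_0^t \mathbf{E}_P\bigl|D\pi_{r,t}(\tilde\pi_r)\cdot\bigl(\tilde h^*\tilde\pi_r\,(\tilde H - \tilde h^*\tilde\pi_r)\tilde\pi_r - h^*\tilde\pi_r\,(H - h^*\tilde\pi_r)\tilde\pi_r\bigr)\bigr|\,dr \\
&+ \frac12\int_0^t \mathbf{E}_P\bigl|D^2\pi_{r,t}(\tilde\pi_r)\cdot(\tilde H - \tilde h^*\tilde\pi_r)\tilde\pi_r - D^2\pi_{r,t}(\tilde\pi_r)\cdot(H - h^*\tilde\pi_r)\tilde\pi_r\bigr|\,dr,
\end{aligned}
\]

where $K = 2\max_k|h^k| + \max_k|\tilde h^k|$.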

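Proof. We wish to calculate $\mathbf{E}_P\|e_t\|^2 = \mathbf{E}_P\,e_t^*e_t$. Using Proposition 4.1,
we obtain

\[
\begin{aligned}
\mathbf{E}_P\|e_t\|^2 ={}& \int_0^t \mathbf{E}_P\,e_t^*\,D\pi_{r,t}(\tilde\pi_r)\cdot\Delta\Lambda\,\tilde\pi_r\,dr
 + \mathbf{E}_P\Bigl[e_t^*\int_0^t D\pi_{r,t}(\tilde\pi_r)\cdot\Delta H(\tilde\pi_r)\,dY_r\Bigr] \\
&- \int_0^t \mathbf{E}_P\,e_t^*\,D\pi_{r,t}(\tilde\pi_r)\cdot\bigl[\tilde h^*\tilde\pi_r\,(\tilde H - \tilde h^*\tilde\pi_r)\tilde\pi_r - h^*\tilde\pi_r\,(H - h^*\tilde\pi_r)\tilde\pi_r\bigr]\,dr \\
&+ \frac12\int_0^t \mathbf{E}_P\,e_t^*\bigl[D^2\pi_{r,t}(\tilde\pi_r)\cdot(\tilde H - \tilde h^*\tilde\pi_r)\tilde\pi_r - D^2\pi_{r,t}(\tilde\pi_r)\cdot(H - h^*\tilde\pi_r)\tilde\pi_r\bigr]\,dr .
\end{aligned}
\]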

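The chief difficulty is the stochastic integral term. Using equation (2.2), we
can write

\[
\mathbf{E}_P\Bigl[e_t^*\int_0^t D\pi_{r,t}(\tilde\pi_r)\cdot\Delta H(\tilde\pi_r)\,dY_r\Bigr]
= \mathbf{E}_Q\Bigl[|U_{0,t}\nu|\,e_t^*\int_0^t D\pi_{r,t}(\tilde\pi_r)\cdot\Delta H(\tilde\pi_r)\,dY_r\Bigr].
\]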

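We would like to apply equation (A.1) to evaluate this expression. First, we
must establish that the integrand is in $\mathrm{Dom}\,\delta$; this does not follow directly
from Proposition 4.1, as the anticipative Itô rule which was used to obtain
that result can yield integrands which are only in $\mathbb{L}^{1,2}_{\mathrm{loc}}$. We can verify directly,
however, that the integrand in this case is indeed in $\mathrm{Dom}\,\delta$, see Lemma
B.3. Next, we must establish that $|U_{0,t}\nu|\,e_t^i$ is in $\mathbb{D}^{1,2}$ for every $i$. Note that
$|U_{0,t}\nu| = \sum_i (U_{0,t}\nu)^i$, so $|U_{0,t}\nu|$ is in $\mathbb{D}^\infty$. Moreover, we establish in Lemma
B.4 that $e_t \in \mathbb{D}^{1,2}$ and that $D_re_t$ is a bounded random variable for every $t$.
Hence it follows from Proposition A.1 that $|U_{0,t}\nu|\,e_t^i \in \mathbb{D}^{1,2}$. Consequently
we can apply equation (A.1), and we obtain

\[
\begin{aligned}
\mathbf{E}_Q\Bigl[|U_{0,t}\nu|\,e_t^*\int_0^t D\pi_{r,t}(\tilde\pi_r)\cdot\Delta H(\tilde\pi_r)\,dY_r\Bigr]
={}& \int_0^t \mathbf{E}_Q\bigl[\bigl(|U_{0,t}\nu|\,D_re_t^* + D_r|U_{0,t}\nu|\,e_t^*\bigr)\,D\pi_{r,t}(\tilde\pi_r)\cdot\Delta H(\tilde\pi_r)\bigr]\,dr \\
={}& \int_0^t \mathbf{E}_Q\bigl[|U_{0,t}\nu|\,(D_r\tilde\pi_t - D_r\pi_t)^*\,D\pi_{r,t}(\tilde\pi_r)\cdot\Delta H(\tilde\pi_r)\bigr]\,dr \\
&+ \int_0^t \mathbf{E}_Q\Bigl[\sum_i (U_{r,t}HU_{0,r}\nu)^i\,e_t^*\,D\pi_{r,t}(\tilde\pi_r)\cdot\Delta H(\tilde\pi_r)\Bigr]\,dr .
\end{aligned}
\]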
Now note that $|e_t^i| \le 1$, and that by Lemma B.4

\[
|(D_r\tilde\pi_t - D_r\pi_t)^i| \le |(D_r\tilde\pi_t)^i| + |(D_r\pi_t)^i| \le \max_k|\tilde h^k| + \max_k|h^k| .
\]

Furthermore we can estimate

\[
\frac{\bigl|\sum_i (U_{r,t}HU_{0,r}\nu)^i\bigr|}{|U_{0,t}\nu|}
\le \frac{1}{|U_{0,t}\nu|}\sum_{i,j,k} U_{r,t}^{ij}\,|h^j|\,U_{0,r}^{jk}\,\nu^k \le \max_k|h^k|,
\]

where we have used a.s. nonnegativity of the matrix elements of $U_{0,r}$ and
$U_{r,t}$ (this must be the case, as, for example, $U_{r,t}\mu$ has nonnegative entries
for any vector $\mu$ with nonnegative entries). Hence we obtain

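\[
\Bigl|\mathbf{E}_Q\Bigl[|U_{0,t}\nu|\,e_t^*\int_0^t D\pi_{r,t}(\tilde\pi_r)\cdot\Delta H(\tilde\pi_r)\,dY_r\Bigr]\Bigr|
\le \Bigl(2\max_k|h^k| + \max_k|\tilde h^k|\Bigr)\int_0^t \mathbf{E}_Q\bigl[|U_{0,t}\nu|\,\bigl|D\pi_{r,t}(\tilde\pi_r)\cdot\Delta H(\tilde\pi_r)\bigr|\bigr]\,dr .
\]

The result follows after straightforward manipulations. □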

Unlike in the case $\tilde h = h$, we now have to deal also with second derivatives
of the filter with respect to its initial condition. These can be estimated 
much in the same way as we dealt with the first derivatives. 



Lemma 4.4. Let $\lambda_{ij} > 0$ $\forall\, i \ne j$ and $\mu \in S^{d-1}$, $v, w \in TS^{d-1}$. Then a.s.

\[
\bigl|D^2\pi_{s,t}(\mu)\cdot v - D^2\pi_{s,t}(\mu)\cdot w\bigr|
\le 2\sum_k \frac{|v^k + w^k|}{\mu^k}\,\sum_k \frac{|v^k - w^k|}{\mu^k}\,
\exp\Bigl(-2(t-s)\min_{p,q\ne p}\sqrt{\lambda_{pq}\lambda_{qp}}\Bigr).
\]

Moreover, the result still holds if $\mu, v, w$ are $\mathcal{F}^Y_s$-measurable random variables
with values a.s. in $S^{d-1}$ and $TS^{d-1}$, respectively.

Proof. Proceeding as in the proof of Proposition 3.3, we can calculate
directly the second derivative of (3.1):



{D\ti^^)■vy = -2{D^Tt{^l)■v) 






Setting fi = u and using the triangle inequality, we obtain 

\v^{DTTt{l') ■ vf — W^{D'Ki{v) ■ w) 



\D'^TTt{v) ■ V - D'^TTt{v) • u;| < 2 ^ 



Another application of the triangle inequality and using Proposition 3.3 
gives 

\D TTt{l') ■ V — D TTt{l') ■ w\ 

k k 

'^ 4- m'^ I . — . III-? — 1(1-? I 



\v + W v-^ \v — w- 



„k L ^j exp ^-2tmm^Xp,X,p 

k J 

We can now repeat the arguments of Corollary 3.4 to establish that the result 
still holds if we replace vro,* by TTs^t, ^pq by Xpq, and u, v, w by ^^^ -measurable 
random variables fj,,v,w. This completes the proof. D 

We are now ready to complete the proof of Theorem 1.1. 

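Proof of Theorem 1.1. Set $\beta = 2\min_{p,q\ne p}(\lambda_{pq}\lambda_{qp})^{1/2}$. Let us collect
all the necessary estimates. First, we have

\[
\int_0^t \mathbf{E}_P\bigl|D\pi_{r,t}(\tilde\pi_r)\cdot\Delta\Lambda\,\tilde\pi_r\bigr|\,dr \le \beta^{-1}\sup_{s\ge 0}\mathbf{E}_P\Bigl(\frac{1}{\min_k\tilde\pi_s^k}\Bigr)\,|\tilde\Lambda^* - \Lambda^*|,
\]

as we showed in Section 3. Next, we obtain

\[
\int_0^t \mathbf{E}_P\bigl|D\pi_{r,t}(\tilde\pi_r)\cdot\Delta H(\tilde\pi_r)\bigr|\,dr \le \beta^{-1}\sup_{\pi\in S^{d-1}}\sum_k\bigl|\tilde h^k - h^k + h^*\pi - \tilde h^*\pi\bigr|
\]

using Corollary 3.4. Using the triangle inequality, we can estimate this by

\[
\int_0^t \mathbf{E}_P\bigl|D\pi_{r,t}(\tilde\pi_r)\cdot\Delta H(\tilde\pi_r)\bigr|\,dr \le (d+1)\,\beta^{-1}\,|\tilde h - h| .
\]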

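Next, we estimate using Corollary 3.4

\[
\begin{aligned}
\int_0^t \mathbf{E}_P\bigl|D\pi_{r,t}(\tilde\pi_r)\cdot&\bigl(\tilde h^*\tilde\pi_r\,(\tilde H - \tilde h^*\tilde\pi_r)\tilde\pi_r - h^*\tilde\pi_r\,(H - h^*\tilde\pi_r)\tilde\pi_r\bigr)\bigr|\,dr \\
&\le \beta^{-1}\sup_{\pi\in S^{d-1}}\sum_k\bigl|\tilde h^*\pi\,(\tilde h^k - \tilde h^*\pi) - h^*\pi\,(h^k - h^*\pi)\bigr| \\
&\le \beta^{-1}\Bigl((d+1)\max_k|h^k| + d\max_{k,\ell}|\tilde h^k - \tilde h^\ell|\Bigr)\,|\tilde h - h|,
\end{aligned}
\]

where we have used the estimate

\[
\begin{aligned}
\sum_k\bigl|\tilde h^*\pi\,(\tilde h^k - \tilde h^*\pi) - h^*\pi\,(h^k - h^*\pi)\bigr|
&\le |h^*\pi|\sum_k\bigl|\tilde h^k - h^k + h^*\pi - \tilde h^*\pi\bigr| + |\tilde h^*\pi - h^*\pi|\sum_k\bigl|\tilde h^k - \tilde h^*\pi\bigr| \\
&\le (d+1)\max_k|h^k|\,|\tilde h - h| + |\tilde h - h|\sum_k\bigl|\tilde h^k - \tilde h^*\pi\bigr| \\
&\le \Bigl((d+1)\max_k|h^k| + d\max_{k,\ell}|\tilde h^k - \tilde h^\ell|\Bigr)\,|\tilde h - h| .
\end{aligned}
\]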

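Next we estimate using Lemma 4.4

\[
\begin{aligned}
\frac12\int_0^t \mathbf{E}_P\bigl|D^2\pi_{r,t}(\tilde\pi_r)\cdot&(\tilde H - \tilde h^*\tilde\pi_r)\tilde\pi_r - D^2\pi_{r,t}(\tilde\pi_r)\cdot(H - h^*\tilde\pi_r)\tilde\pi_r\bigr|\,dr \\
&\le \beta^{-1}\sup_{\pi\in S^{d-1}}\sum_k\bigl|\tilde h^k - \tilde h^*\pi + h^k - h^*\pi\bigr|\,\sum_k\bigl|\tilde h^k - h^k + h^*\pi - \tilde h^*\pi\bigr| \\
&\le d(d+1)\,\beta^{-1}\Bigl(\max_{k,\ell}|\tilde h^k - \tilde h^\ell| + \max_{k,\ell}|h^k - h^\ell|\Bigr)\,|\tilde h - h| .
\end{aligned}
\]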

We have now estimated all the terms in Lemma 4.3, and hence we have
bounded $\mathbf{E}_P\|e_t\|^2 = \mathbf{E}_P\|\tilde\pi_t(\nu) - \pi_t(\nu)\|^2$. It remains to allow for misspecified
initial conditions. To this end, we estimate

\[
\|\tilde\pi_t(\mu) - \pi_t(\nu)\|^2 \le \|e_t\|^2 + \|\tilde\pi_t(\nu) - \tilde\pi_t(\mu)\|\,\bigl(\|\tilde\pi_t(\nu) - \tilde\pi_t(\mu)\| + 2\,\|\tilde\pi_t(\nu) - \pi_t(\nu)\|\bigr).
\]

Hence we obtain, using the equivalence of finite-dimensional norms $\|x\| \le K_{21}|x|$,

\[
\|\tilde\pi_t(\mu) - \pi_t(\nu)\|^2 \le \|e_t\|^2 + 6K_{21}\,|\tilde\pi_t(\nu) - \tilde\pi_t(\mu)|,
\]

where we have used that the simplex is contained in the $(d-1)$-dimensional
unit sphere, so $\|\mu_1 - \mu_2\| \le 2$ $\forall\,\mu_1, \mu_2 \in \Delta^{d-1}$. The statement of the theorem
now follows directly from Lemma 4.3, Proposition 3.5 and the estimates
above. □
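The robustness statement can be explored numerically. The sketch below is an illustration only and makes several assumptions not taken from the text: a two-state signal, hypothetical parameter values and perturbation directions, and a simple positivity-preserving splitting scheme (Euler prediction followed by an exponential correction) standing in for the continuous-time filter. It runs the exact and the misspecified filter on the same observation path and reports the time-averaged $\ell_1$ distance between them.

```python
import numpy as np

def filter_step(pi, dy, lam, h, dt):
    """One splitting step: Euler prediction with Lambda^*, then a positive correction."""
    pred = pi + dt * (lam.T @ pi)
    post = pred * np.exp(h * dy - 0.5 * h**2 * dt)
    return post / post.sum()

def mean_filter_error(eps, T=10.0, n=10_000, seed=1):
    """Time-averaged l1 distance between misspecified and exact filters on one path."""
    rng = np.random.default_rng(seed)
    dt = T / n
    lam = np.array([[-1.0, 1.0], [1.5, -1.5]])   # true rates (hypothetical values)
    h = np.array([-1.0, 1.0])                    # true observation function (hypothetical)
    # Hypothetical perturbations of size eps; eps = 0 recovers the true model.
    lam_mis = lam + eps * np.array([[-1.0, 1.0], [1.0, -1.0]])
    h_mis = h + eps * np.array([1.0, -0.5])
    nu = np.array([0.5, 0.5])                    # common initial condition
    x = int(rng.random() < nu[1])
    pi, pi_mis = nu.copy(), nu.copy()
    total = 0.0
    for _ in range(n):
        # The observation increment is always generated by the TRUE model.
        dy = h[x] * dt + np.sqrt(dt) * rng.normal()
        pi = filter_step(pi, dy, lam, h, dt)
        pi_mis = filter_step(pi_mis, dy, lam_mis, h_mis, dt)
        total += np.abs(pi_mis - pi).sum()
        if rng.random() < -lam[x, x] * dt:
            x = 1 - x
    return total / n

err_exact = mean_filter_error(0.0)
err_perturbed = mean_filter_error(0.2)
```

With `eps = 0` the two recursions coincide, so the reported error is zero; for `eps > 0` the error is positive and bounded by 2, the diameter of the simplex in the $\ell_1$ norm. A single path cannot confirm the uniform-in-time bound of the theorem; it only shows how the comparison is set up.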
APPENDIX A: ANTICIPATIVE STOCHASTIC CALCULUS

The goal of this appendix is to recall briefly the main results of the Malliavin
calculus, Skorokhod integrals and anticipative stochastic calculus that
are needed in the proofs. In our application of the theory we wish to deal
with functionals of the observation process $(Y_t)_{t\in[0,T]}$, where $T$ is some finite
time (usually we will calculate integrals from 0 to $t$, so we can choose
any $T > t$). Recall from Section 2 that $Y$ is an $\mathcal{F}^Y_t$-Wiener process under
the measure $Q$; it will thus be convenient to work always under $Q$, as this
puts us directly in the framework used, for example, in [15]. As the theory
described below is defined $Q$-a.s. and as $P \sim Q$, the corresponding properties
under $P$ are unambiguously obtained by using equation (2.2). We will
presume this setup whenever the theory described here is applied.

A smooth random variable $F$ is one of the form $f(Y(h_1), \ldots, Y(h_n))$,
where $Y(h)$ denotes the Wiener integral of the deterministic function $h \in L^2([0,T])$
with respect to $Y$ and $f$ is a smooth function which is of polynomial
growth together with all its derivatives. For smooth $F$ the Malliavin
derivative $DF$ is defined by

\[
D_tF = \sum_{i=1}^n \frac{\partial f}{\partial x^i}(Y(h_1), \ldots, Y(h_n))\,h_i(t).
\]
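As a worked illustration of this definition (an example added here, not taken from the original text), take $n = 1$, $h_1 = \mathbf{1}_{[0,T]}$ and $f(x) = x^2$, so that $Y(h_1) = Y_T$ and $F = Y_T^2$:

```latex
\[
F = f(Y(h_1)) = Y_T^2, \qquad
D_tF = f'(Y(h_1))\,h_1(t) = 2\,Y_T\,\mathbf{1}_{[0,T]}(t).
\]
% More generally, for F = Y(h)^2 with h in L^2([0,T]):
\[
D_t\bigl(Y(h)^2\bigr) = 2\,Y(h)\,h(t).
\]
```

The derivative is a process indexed by $t$, and for each $t$ it depends on the whole path through $Y_T$, which is why anticipating integrands appear naturally in the calculus below.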

The Malliavin derivative $D$ can be shown ([15], page 26) to be closeable as
an operator from $L^p(\Omega, \mathcal{F}^Y_T, Q)$ to $L^p(\Omega, \mathcal{F}^Y_T, Q; L^2([0,T]))$ for any $p \ge 1$,
and we denote the domain of $D$ in $L^p(\Omega)$ by $\mathbb{D}^{1,p}$ [for notational convenience
we will drop the measure $Q$ and $\sigma$-algebra $\mathcal{F}^Y_T$ throughout this section,
where it is understood that $L^p(\Omega)$ denotes $L^p(\Omega, \mathcal{F}^Y_T, Q)$, etc.]. More generally,
we consider iterated derivatives $D^kF \in L^p(\Omega; L^2([0,T]^k))$ defined by
$D^k_{t_1,\ldots,t_k}F = D_{t_1}\cdots D_{t_k}F$, and the domain of $D^k$ in $L^p(\Omega)$ is denoted by $\mathbb{D}^{k,p}$.
The domains $\mathbb{D}^{k,p}$ can also be localized ([15], pages 44-45), and we denote
the corresponding localized domains by $\mathbb{D}^{k,p}_{\mathrm{loc}}$. Finally, we define the useful
class $\mathbb{D}^\infty = \bigcap_{p\ge 1}\bigcap_{k\ge 1}\mathbb{D}^{k,p}$.

We will use two versions of the chain rule for the Malliavin derivative. 

Proposition A.1. Let $\varphi : \mathbb{R}^m \to \mathbb{R}$ be $C^1$ and $F = (F^1, \ldots, F^m)$ be a
random vector with components in $\mathbb{D}^{1,2}$. Then $\varphi(F) \in \mathbb{D}^{1,2}_{\mathrm{loc}}$ and

\[
D\varphi(F) = \sum_{i=1}^m \frac{\partial\varphi}{\partial x^i}(F)\,DF^i .
\]

If $\varphi(F) \in L^2(\Omega)$ and $D\varphi(F) \in L^2(\Omega\times[0,T])$, then $\varphi(F) \in \mathbb{D}^{1,2}$. These results
still hold if $F$ a.s. takes values in an open domain $V \subset \mathbb{R}^m$ and $\varphi$ is $C^1(V)$.

The first (local) statement is [16], Proposition 2.9; the second statement
can be proved in the same way as [17], Lemma A.1, and the proofs are
easily adapted to the case where $F$ a.s. takes values in some domain. The
next result is from [15], page 62:

Proposition A.2. Let $\varphi : \mathbb{R}^m \to \mathbb{R}$ be a smooth function which is of
polynomial growth together with all its derivatives, and let $F = (F^1, \ldots, F^m)$
be a random vector with components in $\mathbb{D}^\infty$. Then $\varphi(F) \in \mathbb{D}^\infty$ and the usual
chain rule holds. This implies that $\mathbb{D}^\infty$ is an algebra, that is, $FG \in \mathbb{D}^\infty$ for
$F, G \in \mathbb{D}^\infty$.

The following result follows from [15], page 32 (here $[s,t]^c = [0,T]\setminus[s,t]$).

Lemma A.3. If $F \in \mathbb{D}^{1,2}$ is $\mathcal{F}^Y_{[s,t]}$-measurable, then $DF = 0$ a.e. in $\Omega\times[s,t]^c$.

It is useful to be able to calculate explicitly the Malliavin derivative of
the solution of a stochastic differential equation. Consider $dx_t = f(x_t)\,dt + \sigma(x_t)\,dY_t$,
$x_0 \in \mathbb{R}^m$, where $f(x)$ and $\sigma(x)$ are smooth functions of $x$ with
bounded derivatives of all orders. It is well known that such equations generate
a smooth stochastic flow of diffeomorphisms $x_t = \xi_t(x)$ [11]. We now
have the following result.

Proposition A.4. All components of $x_t$ belong to $\mathbb{D}^\infty$ for every $t \in [0,T]$.
We have $D_rx_t = D\xi_t(x_0)\,D\xi_r(x_0)^{-1}\sigma(x_r)$ a.e. $r < t$, where $(D\xi_t(x))^{ij} = \partial\xi_t^i(x)/\partial x^j$
is the Jacobian matrix of the flow, and $D_rx_t = 0$ a.e. $r > t$.

The first statement is given in [15], Theorem 2.2.2, page 105, the second on
[15], equation (2.38), page 109, the third follows from adaptedness (Lemma
A.3).

We now consider $D$ as a closed operator from $L^2(\Omega)$ to $L^2(\Omega\times[0,T])$ with
domain $\mathbb{D}^{1,2}$. Its Hilbert space adjoint $\delta = D^*$ is well defined in the usual
sense as a closed operator from $L^2(\Omega\times[0,T])$ to $L^2(\Omega)$, and we denote its
domain by $\mathrm{Dom}\,\delta$. The operator $\delta$ is called the Skorokhod integral, and
coincides with the Itô integral on the subspace $L^2_a(\Omega\times[0,T]) \subset \mathrm{Dom}\,\delta$ of
adapted square integrable processes ([15], Proposition 1.3.4, page 41). $\delta$
is thus an extension of the Itô integral to a class of possibly anticipative
integrands. To emphasize this point we will write

\[
\delta(u\,\mathbf{1}_{[s,t]}) = \int_s^t u_r\,dY_r, \qquad u\,\mathbf{1}_{[s,t]} \in \mathrm{Dom}\,\delta .
\]

The Skorokhod integral has the following properties. First, its expectation
vanishes: $\mathbf{E}_Q\,\delta(u) = 0$ if $u \in \mathrm{Dom}\,\delta$. Second, by its definition as the adjoint
of $D$ we have

\[
\text{(A.1)}\qquad \mathbf{E}_Q\bigl(F\,\delta(u)\bigr) = \mathbf{E}_Q\Bigl[\int_0^T (D_tF)\,u_t\,dt\Bigr]
\]

if $u \in \mathrm{Dom}\,\delta$, $F \in \mathbb{D}^{1,2}$. We will also use the following result, the proof of
which proceeds in exactly the same way as its one-dimensional counterpart
([15], page 40).

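Lemma A.5. If $u$ is an $n$-vector of processes in $\mathrm{Dom}\,\delta$ and $F$ is an
$m\times n$-matrix of random variables in $\mathbb{D}^{1,2}$ such that $\mathbf{E}_Q\int_0^T\|Fu_t\|^2\,dt < \infty$,
then

\[
\int_0^T Fu_t\,dY_t = F\int_0^T u_t\,dY_t - \int_0^T (D_tF)\,u_t\,dt
\]

in the sense that $Fu \in \mathrm{Dom}\,\delta$ iff the right-hand side of this expression is in $L^2(\Omega)$.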
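A scalar example may help to fix the sign of the correction term; it is an illustration added here and is not part of the original text. Take $m = n = 1$, $u \equiv 1$ and $F = Y_T$, so that $D_tF = \mathbf{1}_{[0,T]}(t)$:

```latex
\[
\int_0^T Y_T\,dY_t \;=\; Y_T\int_0^T dY_t \;-\; \int_0^T D_tY_T\,dt \;=\; Y_T^2 - T .
\]
% For comparison, the adapted integrand Y_t gives the Ito integral
\[
\int_0^T Y_t\,dY_t \;=\; \tfrac12\bigl(Y_T^2 - T\bigr).
\]
```

Both right-hand sides have zero expectation under $Q$, as they must for Skorokhod integrals, but the anticipating integrand $Y_T$ produces a correction twice as large as in the adapted case.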

As it is difficult to obtain general statements for integrands in $\mathrm{Dom}\,\delta$, it
is useful to single out restricted classes of integrands that are easier to deal
with. To this end, define the spaces $\mathbb{L}^{k,p} = L^p([0,T]; \mathbb{D}^{k,p})$ for $k \ge 1$, $p \ge 2$.
Note that $\mathbb{L}^{k,p} \subset \mathbb{L}^{1,2} \subset \mathrm{Dom}\,\delta$ ([15], page 38). Moreover, the domains $\mathbb{L}^{k,p}$
can be localized to $\mathbb{L}^{k,p}_{\mathrm{loc}}$ ([15], pages 43-45). We can now state an Itô change
of variables formula for Skorokhod integrals, see [15, 16, 18]. The extension
to processes that a.s. take values in some domain is straightforward through
localization.

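Proposition A.6. Consider an $m$-dimensional process of the form

\[
x_t = x_0 + \int_0^t v_s\,ds + \int_0^t u_s\,dY_s,
\]

where we assume that $x_t$ has a continuous version and $x_0 \in (\mathbb{D}^{1,2}_{\mathrm{loc}})^m$, $v \in (\mathbb{L}^{1,4}_{\mathrm{loc}})^m$,
and $u \in (\mathbb{L}^{2,4}_{\mathrm{loc}})^m$. Let $\varphi : \mathbb{R}^m \to \mathbb{R}$ be a $C^2$ function. Then

\[
\varphi(x_t) = \varphi(x_0) + \int_0^t D\varphi(x_s)\,v_s\,ds + \int_0^t D\varphi(x_s)\,u_s\,dY_s
 + \frac12\int_0^t \bigl\langle D^2\varphi(x_s)\,\nabla_sx_s, u_s\bigr\rangle\,ds,
\]

where $\nabla_sx_s = \lim_{\varepsilon\searrow 0} D_s(x_{s+\varepsilon} + x_{s-\varepsilon})$, $D\varphi(x_s)\,u_s = \sum_i (\partial\varphi/\partial x^i)(x_s)\,u_s^i$, and
$\langle D^2\varphi(x_s)\,\nabla_sx_s, u_s\rangle = \sum_{i,j}(\partial^2\varphi/\partial x^i\partial x^j)(x_s)\,u_s^i\,\nabla_sx_s^j$. The result still holds if
$x_s$ a.s. takes values in an open domain $V \subset \mathbb{R}^m$ $\forall\, s \in [0,t]$ and $\varphi$ is $C^2(V)$.

APPENDIX B: SOME TECHNICAL RESULTS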
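Lemma B.1. The following equality holds:

\[
U_{0,t}\int_0^s U_{0,r}^{-1}(\tilde H - H)\tilde U_{0,r}\nu\,dY_r
= \int_0^s U_{r,t}(\tilde H - H)\tilde U_{0,r}\nu\,dY_r + \int_0^s U_{r,t}H(\tilde H - H)\tilde U_{0,r}\nu\,dr .
\]

The integral on the left-hand side is an Itô integral, on the right-hand side
a Skorokhod integral.

Proof. We have already established in the proof of Proposition 4.1 that
the matrix elements of $U_{0,t}$ are in $\mathbb{D}^\infty \subset \mathbb{D}^{1,2}$. Moreover,

\[
\begin{aligned}
\mathbf{E}_Q\|U_{r,t}(\tilde H - H)\tilde U_{0,r}\nu\|^2
&\le \|\tilde H - H\|^2\,\mathbf{E}_Q\bigl(\|U_{r,t}\|^2\,\|\tilde U_{0,r}\|^2\bigr) \\
&\le \|\tilde H - H\|^2\sqrt{\mathbf{E}_Q\|U_{r,t}\|^4\;\mathbf{E}_Q\|\tilde U_{0,r}\|^4} \\
&\le C_4^4\,\|\tilde H - H\|^2\sqrt{\mathbf{E}_Q|||U_{r,t}|||_4^4\;\mathbf{E}_Q|||\tilde U_{0,r}|||_4^4},
\end{aligned}
\]

where we have used the Cauchy-Schwarz inequality and $\|\nu\| \le 1$ for $\nu \in S^{d-1}$.
Here $|||U|||_p = (\sum_{ij}U_{ij}^p)^{1/p}$ is the elementwise $p$-norm of $U$, $\|U\|$ is
the usual matrix 2-norm, and $C_p$ matches the norms, $\|U\| \le C_p|||U|||_p$ (recall
that all norms on a finite-dimensional space are equivalent). As $\tilde U_{0,r}$, $U_{r,t}$ are
solutions of linear stochastic differential equations, standard estimates give
for any integer $p \ge 2$

\[
\mathbf{E}_Q\Bigl(\sup_{0\le r\le t}|||U_{r,t}|||_p^p\Bigr) \le D_1(p) < \infty, \qquad
\mathbf{E}_Q\Bigl(\sup_{0\le r\le t}|||\tilde U_{0,r}|||_p^p\Bigr) \le D_2(p) < \infty,
\]

and we obtain

\[
\int_0^s \mathbf{E}_Q\|U_{r,t}(\tilde H - H)\tilde U_{0,r}\nu\|^2\,dr \le s\sup_{0\le r\le s}\mathbf{E}_Q\|U_{r,t}(\tilde H - H)\tilde U_{0,r}\nu\|^2 < \infty .
\]

Hence we can apply Lemma A.5 to obtain the result. By a similar calculation
we can establish that the right-hand side of the expression in Lemma A.5 for
our case is square integrable, so that the Skorokhod integral is well defined. □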

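Lemma B.2. The anticipating Itô rule with $\Sigma(x) = x/|x|$ can be applied
to

\[
U_{s,t}\tilde U_{0,s}\nu = U_{0,t}\nu + \int_0^s U_{r,t}(\tilde\Lambda^* - \Lambda^*)\tilde U_{0,r}\nu\,dr + \int_0^s U_{r,t}(\tilde H - H)\tilde U_{0,r}\nu\,dY_r .
\]

Proof. Clearly the Skorokhod integral term has a.s. continuous sample
paths, as both $U_{s,t}\tilde U_{0,s}\nu$ and the time integrals do; moreover, $U_{0,t}\nu \in (\mathbb{D}^\infty)^d$.
In order to be able to apply Proposition A.6, it remains to check the technical
conditions $v_r = U_{r,t}(\tilde\Lambda^* - \Lambda^*)\tilde U_{0,r}\nu \in (\mathbb{L}^{1,4})^d$, $u_r = U_{r,t}(\tilde H - H)\tilde U_{0,r}\nu \in (\mathbb{L}^{2,4})^d$.

As $\mathbb{D}^\infty$ is an algebra, $u_r$ and $v_r$ take values in $\mathbb{D}^\infty$. Moreover, we can
establish exactly as in the proof of Lemma B.1 that $u$ and $v$ are in $L^4(\Omega\times[0,t])$.
To complete the proof we must establish that

\[
\sum_i\int_0^t \mathbf{E}_Q\Bigl[\Bigl(\int_0^t (D_su_r^i)^2\,ds\Bigr)^2\Bigr]dr < \infty, \qquad
\sum_i\int_0^t \mathbf{E}_Q\Bigl[\Bigl(\int_0^t (D_sv_r^i)^2\,ds\Bigr)^2\Bigr]dr < \infty,
\]

thus ensuring that $u, v \in (\mathbb{L}^{1,4})^d$, and

\[
\sum_i\int_0^t \mathbf{E}_Q\Bigl[\Bigl(\int_0^t\!\!\int_0^t (D_\sigma D_su_r^i)^2\,ds\,d\sigma\Bigr)^2\Bigr]dr < \infty,
\]

which ensures that $u \in (\mathbb{L}^{2,4})^d$. Using the Cauchy-Schwarz inequality we
have

\[
\sum_i\int_0^t \mathbf{E}_Q\Bigl[\Bigl(\int_0^t (D_su_r^i)^2\,ds\Bigr)^2\Bigr]dr
\le t\int_0^t\!\!\int_0^t \mathbf{E}_Q\|D_su_r\|_4^4\,ds\,dr \le t^3\sup_{0\le r,s\le t}\mathbf{E}_Q\|D_su_r\|_4^4,
\]

and similarly for $v$. Moreover, we obtain

\[
\sum_i\int_0^t \mathbf{E}_Q\Bigl[\Bigl(\int_0^t\!\!\int_0^t (D_\sigma D_su_r^i)^2\,ds\,d\sigma\Bigr)^2\Bigr]dr
\le t^5\sup_{0\le r,s,\sigma\le t}\mathbf{E}_Q\|D_\sigma D_su_r\|_4^4 .
\]

But using the chain rule Proposition A.2 we can easily establish that

\[
D_su_r = \begin{cases}
U_{r,t}(\tilde H - H)\tilde U_{s,r}\tilde H\tilde U_{0,s}\nu, & \text{a.e. } 0 < s < r < t, \\
U_{s,t}HU_{r,s}(\tilde H - H)\tilde U_{0,r}\nu, & \text{a.e. } 0 < r < s < t,
\end{cases}
\]

and similarly

\[
D_\sigma D_su_r = \begin{cases}
U_{r,t}(\tilde H - H)\tilde U_{s,r}\tilde H\tilde U_{\sigma,s}\tilde H\tilde U_{0,\sigma}\nu, & \text{a.e. } 0 < \sigma < s < r < t, \\
U_{r,t}(\tilde H - H)\tilde U_{\sigma,r}\tilde H\tilde U_{s,\sigma}\tilde H\tilde U_{0,s}\nu, & \text{a.e. } 0 < s < \sigma < r < t, \\
U_{\sigma,t}HU_{r,\sigma}(\tilde H - H)\tilde U_{s,r}\tilde H\tilde U_{0,s}\nu, & \text{a.e. } 0 < s < r < \sigma < t, \\
U_{s,t}HU_{r,s}(\tilde H - H)\tilde U_{\sigma,r}\tilde H\tilde U_{0,\sigma}\nu, & \text{a.e. } 0 < \sigma < r < s < t, \\
U_{s,t}HU_{\sigma,s}HU_{r,\sigma}(\tilde H - H)\tilde U_{0,r}\nu, & \text{a.e. } 0 < r < \sigma < s < t, \\
U_{\sigma,t}HU_{s,\sigma}HU_{r,s}(\tilde H - H)\tilde U_{0,r}\nu, & \text{a.e. } 0 < r < s < \sigma < t.
\end{cases}
\]

The desired estimates now follow as in the proof of Lemma B.1. □

Lemma B.3. The Skorokhod integrand obtained by applying the anticipative
Itô formula as in Lemma B.2 is in $\mathrm{Dom}\,\delta$.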

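Proof. We use the notation $\rho_r = \tilde U_{0,r}\nu$. The Skorokhod integral in question
is

\[
\int_0^t D\Sigma(U_{r,t}\rho_r)\,U_{r,t}(\tilde H - H)\rho_r\,dY_r = \int_0^t f_r\,dY_r .
\]

To establish $f \in \mathrm{Dom}\,\delta$, it suffices to show that $f \in \mathbb{L}^{1,2}$. We begin by showing

\[
\begin{aligned}
\sum_i\bigl|\bigl(D\Sigma(U_{r,t}\rho_r)\,U_{r,t}(\tilde H - H)\rho_r\bigr)^i\bigr|
&= \sum_i\Bigl|\frac{1}{|U_{r,t}\rho_r|}\sum_{j,k}\bigl(\delta^{ij} - \Sigma^i(U_{r,t}\rho_r)\bigr)\,U_{r,t}^{jk}\,(\tilde h^k - h^k)\,\rho_r^k\Bigr| \\
&\le \frac{1}{|U_{r,t}\rho_r|}\sum_{i,j,k} U_{r,t}^{jk}\,|\tilde h^k - h^k|\,\rho_r^k \\
&\le \max_k|\tilde h^k - h^k|\;\frac{1}{|U_{r,t}\rho_r|}\sum_{i,j,k} U_{r,t}^{jk}\,\rho_r^k
 = d\max_k|\tilde h^k - h^k|,
\end{aligned}
\]

where we have used the triangle inequality, $|\delta^{ij} - \Sigma^i(x)| \le 1$ for any $x \in \mathbb{R}^d_{++}$,
and the fact that $U_{r,t}$ and $\rho_r$ have nonnegative entries a.s. Hence $f_r$ is a
bounded process. Similarly, we will show that $D_sf_r$ is a bounded process.
Note that $f_r$ is a smooth function on $\mathbb{R}^d_{++}$ of positive random variables in
$\mathbb{D}^\infty$; hence we can apply the chain rule Proposition A.1. This gives

\[
(D_sf_r)^i = \begin{cases}
\displaystyle\sum_{j,k} D^2\Sigma^{ijk}(U_{r,t}\rho_r)\,(U_{r,t}(\tilde H - H)\rho_r)^j\,(U_{r,t}\tilde U_{s,r}\tilde H\rho_s)^k
 + \sum_j D\Sigma^{ij}(U_{r,t}\rho_r)\,(U_{r,t}(\tilde H - H)\tilde U_{s,r}\tilde H\rho_s)^j, & \text{a.e. } s < r, \\[2ex]
\displaystyle\sum_{j,k} D^2\Sigma^{ijk}(U_{r,t}\rho_r)\,(U_{r,t}(\tilde H - H)\rho_r)^j\,(U_{s,t}HU_{r,s}\rho_r)^k
 + \sum_j D\Sigma^{ij}(U_{r,t}\rho_r)\,(U_{s,t}HU_{r,s}(\tilde H - H)\rho_r)^j, & \text{a.e. } s > r,
\end{cases}
\]

where $D^2\Sigma^{ijk} = \partial^2\Sigma^i/\partial x^j\partial x^k$. Proceeding exactly as before, we find that
$Df \in L^\infty(\Omega\times[0,t]^2)$. But then by Proposition A.1 we can conclude that
$f_r \in \mathbb{D}^{1,2}$ for a.e. $r \in [0,t]$, and in particular $f \in \mathbb{L}^{1,2}$. Hence the proof is complete. □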

Lemma B.4. $D_r\pi_s = D\pi_{r,s}(\pi_r)\cdot(H - h^*\pi_r)\pi_r$ a.e. $r < s$, $D_r\pi_s = 0$ a.e.
$r > s$. Moreover $|(D_r\pi_s)^i| \le \max_k|h^k|$ for every $i$. The equivalent results hold
for $D_r\tilde\pi_s$. In particular, this implies that $\pi_s$ and $\tilde\pi_s$ are in $\mathbb{D}^{1,2}$.

Proof. The case $r > s$ is immediate from adaptedness of $\pi_s$. For $r < s$,
apply the chain rule to $\pi_s = \Sigma(U_{0,s}\nu) \in \mathbb{D}^{1,2}_{\mathrm{loc}}$. Boundedness of the resulting
expression follows, for example, as in the proof of Lemma B.3, and hence it
follows that $\pi_s \in \mathbb{D}^{1,2}$. □

Acknowledgment. R. van Handel thanks P. S. Krishnaprasad of the Uni- 
versity of Maryland for hosting a visit to the Institute for Systems Research, 
during which this work was initiated.

REFERENCES 

[1] Atar, R. and Zeitouni, O. (1997). Lyapunov exponents for finite state nonlinear filtering. SIAM J. Control Optim. 35 36-55. MR1430282

[2] Baxendale, P., Chigansky, P. and Liptser, R. (2004). Asymptotic stability of the Wonham filter: Ergodic and nonergodic signals. SIAM J. Control Optim. 43 643-669. MR2086177

[3] Bhatt, A. G., Kallianpur, G. and Karandikar, R. L. (1995). Uniqueness and robustness of solution of measure-valued equations of nonlinear filtering. Ann. Probab. 23 1895-1938. MR1379173

[4] Bhatt, A. G., Kallianpur, G. and Karandikar, R. L. (1999). Robustness of the nonlinear filter. Stochastic Process. Appl. 81 247-254. MR1694557

[5] Brigo, D., Hanzon, B. and Le Gland, F. (1999). Approximate nonlinear filtering by projection on exponential manifolds of densities. Bernoulli 5 495-534. MR1693600

[6] Budhiraja, A. and Kushner, H. J. (1998). Robustness of nonlinear filters over the infinite time interval. SIAM J. Control Optim. 36 1618-1637. MR1626884

[7] Budhiraja, A. and Kushner, H. J. (1999). Approximation and limit results for nonlinear filters over an infinite time interval. SIAM J. Control Optim. 37 1946-1979. MR1720146

[8] Da Prato, G., Fuhrman, M. and Malliavin, P. (1999). Asymptotic ergodicity of the process of conditional law in some problem of non-linear filtering. J. Funct. Anal. 164 356-377. MR1695555

[9] Elliott, R. J., Aggoun, L. and Moore, J. B. (1995). Hidden Markov Models. Springer, New York. MR1323178

[10] Guo, X. and Yin, G. (2006). The Wonham filter with random parameters: Rate of convergence and error bounds. IEEE Trans. Automat. Control 51 460-464. MR2205683

[11] Kunita, H. (1984). Stochastic differential equations and stochastic flows of diffeomorphisms. École d'Été de Probabilités de Saint-Flour XII-1982. Lecture Notes in Math. 1097 143-303. Springer, Berlin. MR0876080

[12] Le Gland, F. and Oudjane, N. (2003). A robustification approach to stability and to uniform particle approximation of nonlinear filters: The example of pseudo-mixing signals. Stochastic Process. Appl. 106 279-316. MR1989630

[13] Le Gland, F. and Oudjane, N. (2004). Stability and uniform approximation of nonlinear filters using the Hilbert metric and application to particle filters. Ann. Appl. Probab. 14 144-187. MR2023019

[14] Liptser, R. S. and Shiryaev, A. N. (2001). Statistics of Random Processes I. Springer, Berlin. MR1800857

[15] Nualart, D. (1995). The Malliavin Calculus and Related Topics. Springer, New York. MR2200233

[16] Nualart, D. and Pardoux, E. (1988). Stochastic calculus with anticipating integrands. Probab. Theory Related Fields 78 535-581. MR0950346

[17] Ocone, D. L. and Karatzas, I. (1991). A generalized Clark representation formula, with application to optimal portfolios. Stochastics Stochastics Rep. 34 187-220. MR1124835

[18] Ocone, D. and Pardoux, E. (1989). A generalized Itô-Ventzell formula. Application to a class of anticipating stochastic differential equations. Ann. Inst. H. Poincaré Probab. Statist. 25 39-71. MR0995291

[19] Papavasiliou, A. (2006). Parameter estimation and asymptotic stability in stochastic filtering. Stochastic Process. Appl. 116 1048-1065. MR2238613

[20] Protter, P. E. (2004). Stochastic Integration and Differential Equations, 2nd ed. Springer, Berlin. MR2020294

[21] Rogers, L. C. G. and Williams, D. (2000). Diffusions, Markov Processes, and Martingales. 2. Itô Calculus. Cambridge Univ. Press. MR1780932

[22] Wonham, W. M. (1965). Some applications of stochastic differential equations to optimal nonlinear filtering. J. Soc. Indust. Appl. Math. Ser. A Control 2 347-369. MR0186472

Department of Mathematics
The Weizmann Institute of Science
Rehovot 76100
Israel
E-mail: pavel.chigansky@weizmann.ac.il

Physical Measurement and Control 266-33
California Institute of Technology
Pasadena, California 91125
USA
E-mail: ramon@its.caltech.edu



